calcite-dev mailing list archives

From Julian Hyde
Subject Re: Filter push
Date Thu, 09 Oct 2014 13:27:35 GMT
I was wondering whether a simpler SPI, one that lets you get results
from a Table without generating code, would help. I just logged

* CursorableTable is an optional interface that can be implemented by
any Table that allows you to get the results directly, without code
generation, and without creating a TableAccessRel or similar. It
returns a Cursor, which is similar to a JDBC ResultSet but much
simpler to implement, and is more efficient than an Iterator or

* ProjectableCursorableTable goes further, and allows Calcite to
specify a list of projected fields and a list of filters. The cursor
must implement the projects, but it can choose which filters it is
able to implement.
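
A minimal sketch of how these two interfaces might look. Everything
below is illustrative: the method names, the EqualsFilter class, and the
convention that the table removes from the list the filters it handled
are assumptions made for this sketch, not a settled API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical forward-only cursor, much smaller than a JDBC ResultSet. */
interface Cursor extends AutoCloseable {
  boolean next();          // advance to the next row; false when exhausted
  Object get(int column);  // value of a column in the current row
  @Override void close();
}

/** Hypothetical: a table that can be read directly, without code generation. */
interface CursorableTable {
  Cursor cursor();
}

/** Hypothetical filter of the form "column = value"; a stand-in for real expressions. */
final class EqualsFilter {
  final int column;
  final Object value;
  EqualsFilter(int column, Object value) {
    this.column = column;
    this.value = value;
  }
}

/** Hypothetical: the caller passes projected fields and candidate filters. */
interface ProjectableCursorableTable extends CursorableTable {
  /**
   * Must apply all projects. Assumed convention: the table removes from
   * {@code filters} each filter it handles; the rest are left for the caller.
   */
  Cursor cursor(int[] projects, List<EqualsFilter> filters);
}

/** Cursor over rows already held in memory. */
final class ListCursor implements Cursor {
  private final List<Object[]> rows;
  private int pos = -1;
  ListCursor(List<Object[]> rows) { this.rows = rows; }
  @Override public boolean next() { return ++pos < rows.size(); }
  @Override public Object get(int column) { return rows.get(pos)[column]; }
  @Override public void close() {}
}

/** Toy table that only knows how to filter on column 0, its "key". */
final class InMemoryTable implements ProjectableCursorableTable {
  private final int columnCount;
  private final List<Object[]> rows;

  InMemoryTable(int columnCount, List<Object[]> rows) {
    this.columnCount = columnCount;
    this.rows = rows;
  }

  @Override public Cursor cursor() {
    int[] all = new int[columnCount];
    for (int i = 0; i < columnCount; i++) all[i] = i;
    return cursor(all, new ArrayList<>());
  }

  @Override public Cursor cursor(int[] projects, List<EqualsFilter> filters) {
    // Claim the filters on the key column; leave the others in the list.
    List<EqualsFilter> handled = new ArrayList<>();
    for (EqualsFilter f : filters) {
      if (f.column == 0) handled.add(f);
    }
    filters.removeAll(handled);

    List<Object[]> out = new ArrayList<>();
    for (Object[] row : rows) {
      boolean keep = true;
      for (EqualsFilter f : handled) keep &= f.value.equals(row[f.column]);
      if (!keep) continue;
      Object[] projected = new Object[projects.length];
      for (int i = 0; i < projects.length; i++) projected[i] = row[projects[i]];
      out.add(projected);
    }
    return new ListCursor(out);
  }
}
```

In this sketch the caller learns which filters were pushed down by
looking at what remains in the list after the call, and applies the
remaining ones itself.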

Would these interfaces make it easier to implement your RocksDB interface?
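
For the seek optimization described in the quoted message below, the
table would need to turn predicates on the primary key into a start and
an end key before scanning. A minimal sketch of that step, assuming
string keys and using a TreeMap as a stand-in for the ordered store (no
RocksDB calls are shown; KeyRange and KeyValueScan are hypothetical
names). An IN list or OR would yield several ranges, which this sketch
does not cover.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical inclusive key range; a null bound means "unbounded". */
final class KeyRange {
  final String start;
  final String end;

  KeyRange(String start, String end) {
    this.start = start;
    this.end = end;
  }

  static KeyRange all() { return new KeyRange(null, null); }
  static KeyRange point(String key) { return new KeyRange(key, key); }
  static KeyRange between(String lo, String hi) { return new KeyRange(lo, hi); }

  /** Range satisfying both this and other (AND of two key predicates). */
  KeyRange intersect(KeyRange other) {
    return new KeyRange(tighterStart(start, other.start), tighterEnd(end, other.end));
  }

  boolean isEmpty() {
    return start != null && end != null && start.compareTo(end) > 0;
  }

  private static String tighterStart(String a, String b) {
    if (a == null) return b;
    if (b == null) return a;
    return a.compareTo(b) >= 0 ? a : b;  // the larger start is tighter
  }

  private static String tighterEnd(String a, String b) {
    if (a == null) return b;
    if (b == null) return a;
    return a.compareTo(b) <= 0 ? a : b;  // the smaller end is tighter
  }
}

final class KeyValueScan {
  /** Seeks to range.start and reads through range.end instead of scanning every key. */
  static List<String> scan(NavigableMap<String, String> store, KeyRange range) {
    if (range.isEmpty()) return new ArrayList<>();
    NavigableMap<String, String> view = store;
    if (range.start != null) view = view.tailMap(range.start, true);
    if (range.end != null) view = view.headMap(range.end, true);
    return new ArrayList<>(view.values());
  }
}
```

Filters that do not touch the key would be left for Calcite to evaluate
on the rows the scan returns.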


On Tue, Oct 7, 2014 at 6:27 PM, Dan Di Spaltro wrote:
> Thanks for the response.  Here is my attempt to clearly explain the only
> push-down/optimization/shortcut (whatever it is) I am trying to do.
> I have two physical operations that the db api can do, get and scan
> (specifying a start). Since it is a simple key value store I am storing
> keys in a hierarchical fashion 1 level deep, as mentioned in the previous
> thread.
> I want to do one simple optimization, and that is if you specify what I
> deem is a "primary key" in the filter either through a between, in, OR's or
> whatever, I want to tell the physical db scan to seek.  That's really all I
> am trying to do outside of all the stuff Optiq gives me.
> I have a table scan that takes a start key and an end, and a list of
> projected columns (since it's only known at read time).  That produces an
> enumerable which maps directly to the physical iterator.  I can't quite
> work out in my head how to introspect the columns, figure out if it's one
> of the primary columns, add more metadata to the Scan call, then perform
> the normal operation.  That's where I am most getting tripped up.
> On Tue, Oct 7, 2014 at 10:21 AM, Vladimir Sitnikov wrote:
>> Dan,
>> >As always, a good example helps
>> Did you succeed with workable "select * from rocksdb_table"?
>> Can you share your code so conversation can become more specific?
> Yes I did, in a couple of different increments.  First following the CSV
> type model, minus any filter push down, but with projection.  Then the more
> Mongo-like structure, where I define my own convention, but that didn't
> really get me to what I wanted.
> It's just tough since I am not writing something that is generally useful,
> but I can try to put something up.
>> The calcite.debug code that you've posted recently has no rocksdb calls,
>> thus it looks wrong.
> I might have posted the wrong one, I've been playing with a lot of
> examples...
>> >Do you think this would make more sense to follow in the footsteps of the
>> >spark model, since it's more about generating code that is run via spark
>> >RDD's vs translating queries from one language to another (in the case of
>> >Mongo/splunk)?
>> Mongo/spark have their own query languages, thus those adapters do the
>> "translating queries from one language to another" stuff to push more
>> conditions/expressions to the database engine.
> I guess I equated Spark with "normal" code vs string translation. Like
> filter conditions per row operate in much the same way as in
> reflectionschema,
>> As far as I understand, rocksdb speaks just java (there is no such thing as
>> rocksdb-language), thus I would suggest going with "translate to java calls
>> (rocksdb API)" approach.
> I tried to address that above.
>> You should have a good, concrete aim.
>> "push down filters to rocksdb" is a wrong aim. Well, it might be a good aim
>> if you are Julian and you know what you are doing, but it does not seem to
>> be the case.
>> "make Calcite use rocks.get() api to fetch row by key given in this kind of
>> SQL" is a good one.
>> "display all rows from rocksdb as a table" is also a good aim.
>> The easiest approach from my point of view, is to use Calcite as an
>> intermediate framework that translates SQL to _appropriate_ calls of your
>> storage engine (see Julian's approach earlier in this thread).
>> Calcite can glue together the iterations and fill in missing parts. For
>> instance, you can have "group by" implemented for free.
>> Does that make sense?
>> --
>> Vladimir
> --
> Dan Di Spaltro
