calcite-dev mailing list archives

From: Julian Hyde
Subject: Re: Filter push
Date: Thu, 02 Oct 2014 18:50:32 GMT

Glad you found the Mongo adapter. It’s definitely closer to what you want.

Questions such as [1], and also Andrew Selden’s experience working on an Elasticsearch adapter
[4] have made me think that an interpreter [5] might be useful, so you can execute queries
without converting expressions to Java strings and back again. There is a partial implementation
already. Would an interpreter be useful to you?
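
To make the idea concrete, here is a minimal sketch of evaluating a filter by walking an expression tree directly against each row, with no Java source generated or compiled along the way. The Expr interface and the field, literal and greaterThan helpers are hypothetical names invented for this illustration; they are not Calcite's RexNode classes or its actual interpreter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: a toy expression tree, not Calcite's RexNode or interpreter.
final class FilterInterpreterSketch {
  /** A node that is evaluated directly against a row. */
  interface Expr {
    Object eval(Object[] row);
  }

  /** Reads column i of the row. */
  static Expr field(int i) {
    return row -> row[i];
  }

  /** A constant value. */
  static Expr literal(Object v) {
    return row -> v;
  }

  /** a > b, assuming both sides evaluate to mutually comparable values. */
  @SuppressWarnings("unchecked")
  static Expr greaterThan(Expr a, Expr b) {
    return row -> ((Comparable<Object>) a.eval(row)).compareTo(b.eval(row)) > 0;
  }

  /** Keeps the rows for which the condition evaluates to true. */
  static List<Object[]> filter(List<Object[]> rows, Expr condition) {
    List<Object[]> out = new ArrayList<>();
    for (Object[] row : rows) {
      if (Boolean.TRUE.equals(condition.eval(row))) {
        out.add(row);
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<Object[]> emp = Arrays.asList(
        new Object[] {1, "a"}, new Object[] {15, "b"}, new Object[] {30, "c"});
    // Roughly "where id > 10", with id as column 0.
    for (Object[] row : filter(emp, greaterThan(field(0), literal(10)))) {
      System.out.println(Arrays.toString(row));
    }
  }
}
```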

On Oct 2, 2014, at 10:17 AM, Dan Di Spaltro wrote:

> For instance in rocksdb
> everything besides the primary key is a table scan [2].  And it works
> like a cursor, you just iterate over the values.  Ideally during that
> iteration you could apply the simple filtering.

By the way, HBase works in a similar way. It is an ambition of mine (and James Taylor’s)
to find a way to bring Calcite and Phoenix together somehow.

> Like I mentioned above this is where I am getting tripped up, since
> it's such a basic datastore, I am having a hard time grokking how to
> express that.
> I was thinking of using janino to compile to a java expression and
> passing that to the iteration engine, but that is going to take some
> time.

What is the Java API to RocksDB? I found [6] and RocksDB [7] and RocksIterator [8].

One way to think about this is to choose a reasonably challenging query, implement it by hand
(post the Java code to this list) and then we’ll figure out how to generate that code (or
generate calls to a helper class that has the same effect).

If for example the query is “select … from emp where id between 10 and 20”, my guess
is that you’d write

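RocksDB db = …;
RocksIterator iter = db.newIterator();
byte[] start = toBytes(10);
byte[] end = toBytes(20);
// Position the cursor at the first key >= start.
iter.seek(start);
while (iter.isValid()) {
   byte[] k = iter.key();
   if (compare(k, end) > 0) {
      break; // past the upper bound, so stop scanning
   }
   byte[] v = iter.value();
   // emit (k, v) somehow
   iter.next();
}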

Then you need to package that as an Enumerable.
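
As a rough sketch of that packaging step, the loop can be turned into a lazy iterator that seeks to the start key and stops once a key passes the end key. To keep the example self-contained, several things here are stand-ins and assumptions: a sorted in-memory TreeMap plays the role of RocksDB, a plain java.lang.Iterable plays the role of Calcite's Enumerable, and the key helper is a hypothetical encoding that only orders non-negative ids correctly.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.Iterator;
import java.util.Map;
import java.util.NavigableMap;
import java.util.NoSuchElementException;
import java.util.TreeMap;

// Sketch only: TreeMap stands in for RocksDB, Iterable for Enumerable.
final class RangeScanSketch {
  /** Hypothetical key encoding: 4-byte big-endian, non-negative ids only. */
  static byte[] key(int id) {
    return ByteBuffer.allocate(4).putInt(id).array();
  }

  /** Lazily yields the entries whose keys lie in [start, end]. */
  static Iterable<Map.Entry<byte[], byte[]>> scan(
      NavigableMap<byte[], byte[]> store, byte[] start, byte[] end) {
    return () -> new Iterator<Map.Entry<byte[], byte[]>>() {
      // Counterpart of iter.seek(start): begin at the first key >= start.
      private final Iterator<Map.Entry<byte[], byte[]>> cursor =
          store.tailMap(start, true).entrySet().iterator();
      private Map.Entry<byte[], byte[]> pending = advance();

      private Map.Entry<byte[], byte[]> advance() {
        if (!cursor.hasNext()) {
          return null;
        }
        Map.Entry<byte[], byte[]> e = cursor.next();
        // A key beyond the upper bound ends the scan.
        return Arrays.compareUnsigned(e.getKey(), end) > 0 ? null : e;
      }

      @Override public boolean hasNext() {
        return pending != null;
      }

      @Override public Map.Entry<byte[], byte[]> next() {
        if (pending == null) {
          throw new NoSuchElementException();
        }
        Map.Entry<byte[], byte[]> e = pending;
        pending = advance();
        return e;
      }
    };
  }

  public static void main(String[] args) {
    NavigableMap<byte[], byte[]> store =
        new TreeMap<byte[], byte[]>(Arrays::compareUnsigned);
    for (int id : new int[] {5, 10, 15, 20, 25}) {
      store.put(key(id), ("emp" + id).getBytes());
    }
    // Roughly "where id between 10 and 20": prints emp10, emp15, emp20.
    for (Map.Entry<byte[], byte[]> e : scan(store, key(10), key(20))) {
      System.out.println(new String(e.getValue()));
    }
  }
}
```

Nothing is read until the caller starts iterating, which is the property an Enumerable-style wrapper needs.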

Then generalize it into a scan that can take start value, end value of various types.
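
One possible shape for that generalization is to keep the scan itself byte-oriented and pass in an encoder per key type, provided each encoder preserves ordering under unsigned byte comparison. The sketch below assumes exactly that; the encodeInt and encodeString helpers and the scan signature are hypothetical, and a TreeMap again stands in for RocksDB's sorted key space.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.function.Function;

// Sketch only: TreeMap stands in for RocksDB's sorted key space.
final class TypedScanSketch {
  /** Order-preserving int encoding: flip the sign bit, then big-endian. */
  static byte[] encodeInt(int v) {
    return ByteBuffer.allocate(4).putInt(v ^ Integer.MIN_VALUE).array();
  }

  /** UTF-8 bytes; unsigned byte order then follows code point order. */
  static byte[] encodeString(String s) {
    return s.getBytes(StandardCharsets.UTF_8);
  }

  /** Returns the values whose keys fall in [start, end] for any key type K. */
  static <K> List<String> scan(NavigableMap<byte[], String> store,
      Function<K, byte[]> encoder, K start, K end) {
    List<String> out = new ArrayList<>();
    byte[] upper = encoder.apply(end);
    for (Map.Entry<byte[], String> e
        : store.tailMap(encoder.apply(start), true).entrySet()) {
      if (Arrays.compareUnsigned(e.getKey(), upper) > 0) {
        break; // past the end key
      }
      out.add(e.getValue());
    }
    return out;
  }

  public static void main(String[] args) {
    NavigableMap<byte[], String> byId =
        new TreeMap<byte[], String>(Arrays::compareUnsigned);
    for (int id : new int[] {-5, 0, 10, 15, 30}) {
      byId.put(encodeInt(id), "emp" + id);
    }
    System.out.println(scan(byId, TypedScanSketch::encodeInt, -5, 10));

    NavigableMap<byte[], String> byName =
        new TreeMap<byte[], String>(Arrays::compareUnsigned);
    for (String name : new String[] {"alice", "bob", "carol", "dave"}) {
      byName.put(encodeString(name), name);
    }
    System.out.println(scan(byName, TypedScanSketch::encodeString, "b", "d"));
  }
}
```

The sign-bit flip matters because a plain big-endian encoding would sort negative ints after positive ones when compared as unsigned bytes.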

>> Create a RocksConvention, a RocksRel interface, and some rules:
>> RocksProjectRule: ProjectRel on a RocksRel ==> RocksProjectRel
>> RocksFilterRule: FilterRel on RocksRel ==> RocksFilterRel
> As an example, that's what this is conveying, right [3]?


>> ArrayTable would be useful if you want to cache data sets in memory. As always with
>> caching, I’d suggest you skip it in version 1.
> I wasn't sure if I could subclass it and use the interesting bits
> since rdb deals with array of bytes, but since serialization isn't
> what I am confused on I'll skip this question.

Yeah, ArrayTable needs things to be in its own particular format. Not appropriate for what
you want.
