calcite-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Julian Hyde <jul...@hydromatic.net>
Subject Re: Review calcite-445
Date Sun, 22 Feb 2015 17:07:11 GMT

> On Feb 22, 2015, at 4:11 AM, Vladimir Sitnikov <sitnikov.vladimir@gmail.com> wrote:
> 
> Hi,
> 
>> This change lays the foundations for streaming queries, such as being able
>> to query both the current contents of a table and future deltas to it
> 
> I am sorry, I cannot yet understand how BINDABLE projects/filters are
> the foundation for chi.
> 
>> to generate a different expression
>> for the two forms of the same table,
> 
> Can you pin-point the part of the code?
> As far as I can find, you always call StreamableTable.stream() before
> usage, so I cannot understand why do you need several expressions.

You can see my various previous attempts in the chi.2 branch. You can't understand how broken
it was unless you run testStream and try to get it working. At one point I went as far as
to add a "boolean stream" field to TableScan, which would have been awful.

The crux is this: EnumerableTableScan has to generate an Expression, and the expression is
generated by RelOptTable.getExpression(Class). Therefore when I called Table.stream() I was
forced to create a new RelOptTableImpl with the a new expression that somehow represented
the fact that I was asking for the stream rather than the base table.

Things only got better when I allowed RelOptTable.getExpression to return null. For a long
time Jacques has been pointing out that regular tables should not need to have linq4j expressions.

Longer term I would like EnumerableTableScan to work only on top of a QueryableTable. The
expression will come from the QueryableTable and RelOptTable.getExpression(Class) will become
obsolete.

>> more of the complex runtime logic (e.g. negotiating which filters and
>> projects to push down) into BindableTableScan.
> 
> You mean TableScanNode, don't you? (its create method is a bit too long)
> 
> It is interesting that TableScanNode can rescan _multiple_ times in
> case projects are too aggressive.
> It tuns out to be:
> 1) Functional issue: enumerable1 is not closed in case of rescan. This
> might lead to resource leak.

Yeah, I'm actually quite proud of those multiple attempts to find filters. It allows the ProjectableFilterableTable
to be flaky (changing its mind about which filters it can execute) but will always terminate,
and projects as few columns as possible.

Each "scan" creates an Enumerable, not an Enumerator. Enumerable does not acquire resources
and hence does not have a close() method.

> 2) Documentation/design flaw. It probably would be nice if PFT could
> return more projections than asked.

If PFT were able to negotiate projections it would be a different interface, and more complex
for users to implement.

> In general change in calcite-445 looks good, however as you put in
> javadoc of CalcSplitRule: "Not enabled by default, as it works against
> the usual flow". It is bad cost does not accomodate the number of
> pushed filters, etc.
> 
> Comments regarding the code:
> c1) TableScanNode seems to pretend it knows how to map Class to field
> list. I'm not sure if it is in line with table.rowType.
> https://github.com/julianhyde/incubator-calcite/blob/bd0b606178252c90805aff49851f4c9d5eea1a0a/core/src/main/java/org/apache/calcite/interpreter/TableScanNode.java#L180

This is emulating the behavior of ReflectiveTable and FieldSelector. I put it in to make existing
tests work.

> c2) ImmutableIntList#identity might probably be better expressed in
> terms of ImmutableIntList#range (that is avoid creation of int[])

I thought about having ImmutableIntList#identity create an AbstractList<Integer> (as
range does). But for code (such as TableScanNode.create2) that requires an ImmutableIntList
you'd have to write "ImmutableIntList.copyOf(ImmutableIntList.identity(x))". That is verbose,
less efficient, and ends up creating the int[] in the end.

Julian
Mime
View raw message