calcite-dev mailing list archives

From Julian Hyde <julianh...@gmail.com>
Subject Re: Pushing a join condition below a LogicalCorrelate
Date Tue, 12 May 2015 02:06:38 GMT
Seems a bit of a stretch, since Join has other ways to represent SEMI and ANTI. Maybe a Correlate
could have both a JoinType and a SemiJoinType?

Can you and Vladimir find a compromise that restores the missing functionality with
no more copy-paste than necessary? It would help if we had a full list of the rules that ought
to work for Correlate.

Julian

On May 11, 2015, at 5:27 PM, Jinfeng Ni <jni@apache.org> wrote:

> Can we extend Join.JoinType, so that it includes the SemiJoinType (SEMI,
> ANTI) represented by Correlate? That way, we could reuse the rules for
> Join and apply them to Correlate as well, just like the way it used to
> work. Otherwise, we have to come up with a new set of rules for Correlate,
> to make things work again.
> 
> 
> 
> On Mon, May 11, 2015 at 5:02 PM, Julian Hyde <julian@hydromatic.net> wrote:
> 
>> This comment in Correlate seems to express Vladimir’s motivation:
>> 
>>> Correlate is not a join since: typical rules should not match Correlate.
>> 
>> I agree with him. For instance, Correlate.joinType is enum SemiJoinType {
>> INNER, LEFT, SEMI, ANTI } and therefore has different semantics from
>> Join.joinType.
>> 
>> It’s unfortunate that FilterJoinRule got broken. We should fix it. Any
>> other rules that would be needed? Probably ProjectJoinTransposeRule,
>> AggregateJoinTransposeRule.
>> 
>> Julian
>> 
>> 
>> On May 11, 2015, at 4:17 PM, Aman Sinha <asinha@maprtech.com> wrote:
>> 
>>> As part of CALCITE-483, the class hierarchy of CorrelateRel was changed
>>> such that the new LogicalCorrelate is not a derived class of Join anymore.
>>> Thus, any rule such as FilterJoinRule that used to push the filter down
>>> into the Join (or a derived class of Join) does not apply anymore for the
>>> LogicalCorrelate.
>>> 
>>> I am continuing down the path of my proposal to have a version of the
>>> push filter rule that allows pushing into/past a LogicalCorrelate. But
>>> perhaps Vladimir can shed some light on the motivation for changing the
>>> class hierarchy.
>>> 
>>> thanks,
>>> Aman
>>> 
>>> 
>>> On Mon, May 11, 2015 at 10:21 AM, Aman Sinha <asinha@maprtech.com> wrote:
>>> 
>>>> Note that I have made some changes to the decorrelation logic to call
>>>> findBestExp() *after* the decorrelation is done and supply it the set of
>>>> rules including FilterJoinRule. This does push the join condition into
>>>> one part of the tree, but it does not push it into all the other parts
>>>> where that join may have been copied during decorrelation. The main
>>>> point is: we need to do the filter pushdown early rather than late.
>>>> 
>>>> Aman
>>>> 
>>>> On Mon, May 11, 2015 at 10:16 AM, Aman Sinha <asinha@maprtech.com> wrote:
>>>> 
>>>>> I want to be able to push the highlighted join condition (=($7, $9))
>>>>> into the LogicalJoin that is right below the LogicalCorrelate. What's
>>>>> the right way to do it?
>>>>> 
>>>>> The current method of first decorrelating and then pushing the filter
>>>>> (via the FilterJoinRule) is not quite right because once decorrelation
>>>>> is done, it may be too late to push the filter into the join. During
>>>>> decorrelation we take that LogicalJoin (with its TRUE condition) and
>>>>> push it into other places. For instance, we call createDistinct() to
>>>>> build a distinct row set on the result of this join, but since the join
>>>>> has a TRUE condition, the distinct is created on a cartesian join.
>>>>> 
>>>>> What I really need is something like a FilterJoinRule that allows
>>>>> pushing the filter past a LogicalCorrelate.
>>>>> 
>>>>> LogicalProject(EXPR$0=[1]): rowcount = 1.0, cumulative cost = 10.25, id = 53
>>>>>   LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], DEPTNO0=[$9], NAME=[$10], EXPR$0=[$11]): rowcount = 1.0, cumulative cost = 9.25, id = 71
>>>>> *   LogicalFilter(condition=[AND(=($7, $9), >($5, $11))]): rowcount = 1.0, cumulative cost = 8.25, id = 68*
>>>>>       LogicalCorrelate(correlation=[$cor0], joinType=[LEFT], requiredColumns=[{0}]): rowcount = 1.0, cumulative cost = 7.25, id = 61
>>>>>         LogicalJoin(condition=[true], joinType=[inner]): rowcount = 1.0, cumulative cost = 1.0, id = 42
>>>>>           LogicalTableScan(table=[[CATALOG, SALES, EMP]]): rowcount = 1.0, cumulative cost = 0.0, id = 11
>>>>>           LogicalTableScan(table=[[CATALOG, SALES, DEPT]]): rowcount = 1.0, cumulative cost = 0.0, id = 12
>>>>>         LogicalAggregate(group=[{}], EXPR$0=[AVG($5)]): rowcount = 1.0, cumulative cost = 2.125, id = 47
>>>>>           LogicalFilter(condition=[=($cor0.EMPNO, $0)]): rowcount = 1.0, cumulative cost = 1.0, id = 45
>>>>>             LogicalTableScan(table=[[CATALOG, SALES, EMP]]): rowcount = 1.0, cumulative cost = 0.0, id = 14
>>>>> 
>>>>> 
>>>>> Thanks,
>>>>> Aman
>>>>> 
>>>> 
>>>> 
>> 
>> 
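
The filter split requested at the bottom of the thread could be sketched roughly as follows. This is only an illustration in plain Java, not Calcite code: the classes Conjunct and CorrelateFilterSplit are hypothetical, and conjuncts are modelled simply by the input field ordinals they reference. It assumes the Correlate's joinType is INNER or LEFT (as in the plan above), so that a conjunct touching only left-input fields can safely move below the Correlate, where a rule such as FilterJoinRule could then merge it into the LogicalJoin.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for one conjunct of a filter condition; not a Calcite class. */
final class Conjunct {
  final String text;      // printable form, e.g. "=($7, $9)"
  final int[] fieldRefs;  // ordinals of the input fields the conjunct references

  Conjunct(String text, int... fieldRefs) {
    this.text = text;
    this.fieldRefs = fieldRefs;
  }
}

/** Hypothetical helper that splits a filter sitting directly above a Correlate. */
final class CorrelateFilterSplit {
  /** Conjuncts that reference only fields of the Correlate's left input. */
  final List<String> pushToLeft = new ArrayList<>();
  /** Conjuncts that reference at least one field of the right input. */
  final List<String> keepAbove = new ArrayList<>();

  /**
   * Classifies each conjunct by the fields it references.
   *
   * @param conjuncts      the AND-ed parts of the filter condition
   * @param leftFieldCount number of fields produced by the Correlate's left input
   */
  static CorrelateFilterSplit split(List<Conjunct> conjuncts, int leftFieldCount) {
    CorrelateFilterSplit result = new CorrelateFilterSplit();
    for (Conjunct c : conjuncts) {
      boolean leftOnly = true;
      for (int ref : c.fieldRefs) {
        if (ref >= leftFieldCount) {
          // The ordinal belongs to the right (correlated) input.
          leftOnly = false;
          break;
        }
      }
      // Left-input ordinals are unchanged below the Correlate, so no shifting is needed.
      if (leftOnly) {
        result.pushToLeft.add(c.text);
      } else {
        result.keepAbove.add(c.text);
      }
    }
    return result;
  }
}
```

For the plan in the thread, the left input (EMP joined with DEPT) produces fields $0 to $10, so leftFieldCount is 11. Splitting AND(=($7, $9), >($5, $11)) with this sketch puts =($7, $9) in pushToLeft and leaves >($5, $11) above the Correlate, because $11 comes from the aggregate on the right side.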

