calcite-dev mailing list archives

From Jinfeng Ni <...@apache.org>
Subject Re: Pushing a join condition below a LogicalCorrelate
Date Tue, 12 May 2015 00:27:43 GMT
Can we extend Join.JoinType so that it includes the SemiJoinType values (SEMI,
ANTI) used by Correlate? That way, we could leverage the rules for Join and
apply them to Correlate as well, just as it used to work. Otherwise, we have
to come up with a new set of rules for Correlate to make things work again.
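
To make the idea concrete, here is a minimal sketch in plain Java of what a
single join-type enum covering both cases might look like, and how a
pushdown rule could consult it. UnifiedJoinType, PushDownSketch and their
methods are hypothetical names made up for illustration; none of this is
Calcite's actual API.

```java
/** Hypothetical join type covering both Join and Correlate (not a Calcite class). */
enum UnifiedJoinType {
  INNER, LEFT, RIGHT, FULL, SEMI, ANTI;

  /** SEMI and ANTI emit only the columns of the left input. */
  boolean projectsRight() {
    return this != SEMI && this != ANTI;
  }

  /** True if rows in the output may carry NULL-padded left columns. */
  boolean generatesNullsOnLeft() {
    return this == RIGHT || this == FULL;
  }

  /** True if rows in the output may carry NULL-padded right columns. */
  boolean generatesNullsOnRight() {
    return this == LEFT || this == FULL;
  }
}

/** Hypothetical helper: which side may a filter sitting above the join move to? */
final class PushDownSketch {
  private PushDownSketch() {}

  /** A filter that references only left columns can move to the left input
   *  as long as the join never NULL-pads the left side. */
  static boolean canPushFilterToLeft(UnifiedJoinType type) {
    return !type.generatesNullsOnLeft();
  }

  /** A filter that references only right columns can move to the right input
   *  if the right columns are visible above the join and never NULL-padded. */
  static boolean canPushFilterToRight(UnifiedJoinType type) {
    return type.projectsRight() && !type.generatesNullsOnRight();
  }
}
```

Under these assumptions, all four Correlate types (INNER, LEFT, SEMI, ANTI)
allow pushing a left-only filter into the left input, which is the case
Aman is asking about.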



On Mon, May 11, 2015 at 5:02 PM, Julian Hyde <julian@hydromatic.net> wrote:

> This comment in Correlate seems to express Vladimir’s motivation:
>
> > Correlate is not a join since: typical rules should not match Correlate.
>
> I agree with him. For instance, Correlate.joinType is enum SemiJoinType {
> INNER, LEFT, SEMI, ANTI } and therefore has different semantics from
> Join.joinType.
>
> It’s unfortunate that FilterJoinRule got broken. We should fix it. Any
> other rules that would be needed? Probably ProjectJoinTransposeRule,
> AggregateJoinTransposeRule.
>
> Julian
>
>
> On May 11, 2015, at 4:17 PM, Aman Sinha <asinha@maprtech.com> wrote:
>
> > As part of CALCITE-483, the class hierarchy of CorrelateRel was changed
> > such that the new LogicalCorrelate is no longer a derived class of Join.
> > Thus, any rule such as FilterJoinRule that used to push the filter down
> > into the Join (or a derived class of Join) no longer applies to the
> > LogicalCorrelate.
> >
> > I am continuing down the path of my proposal to have a version of the
> > push filter rule that allows pushing into/past a LogicalCorrelate. But
> > perhaps Vladimir can shed some light on the motivation for changing the
> > class hierarchy.
> >
> > thanks,
> > Aman
> >
> >
> > On Mon, May 11, 2015 at 10:21 AM, Aman Sinha <asinha@maprtech.com> wrote:
> >
> >> Note that I have made some changes to the decorrelation logic to call
> >> findBestExp() *after* the decorrelation is done and supply it the set of
> >> rules including FilterJoinRule. This does push the join condition into
> >> one part of the tree, but it does not push it into all the other parts
> >> where that join may have been copied during decorrelation. The main
> >> point is: we need to do the filter pushdown early rather than late.
> >>
> >> Aman
> >>
> >> On Mon, May 11, 2015 at 10:16 AM, Aman Sinha <asinha@maprtech.com> wrote:
> >>
> >>> I want to be able to push the highlighted join condition (=($7, $9))
> >>> into the LogicalJoin that is right below the LogicalCorrelate. What's
> >>> the right way to do it?
> >>>
> >>> The current method of first decorrelating and then pushing the filter
> >>> (via the FilterJoinRule) is not quite right because once decorrelation
> >>> is done, it may be too late to push the filter into the join. During
> >>> decorrelation we take that LogicalJoin (with its TRUE condition) and
> >>> push it into other places. For instance, we call createDistinct() to
> >>> build a distinct row set on the result of this join, but since the join
> >>> has a true condition, the distinct is created on a cartesian join.
> >>>
> >>> What I really need is something like a FilterJoinRule that allows
> >>> pushing a filter past a LogicalCorrelate.
> >>>
> >>> LogicalProject(EXPR$0=[1]): rowcount = 1.0, cumulative cost = 10.25, id = 53
> >>>   LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], DEPTNO0=[$9], NAME=[$10], EXPR$0=[$11]): rowcount = 1.0, cumulative cost = 9.25, id = 71
> >>> *   LogicalFilter(condition=[AND(=($7, $9), >($5, $11))]): rowcount = 1.0, cumulative cost = 8.25, id = 68*
> >>>       LogicalCorrelate(correlation=[$cor0], joinType=[LEFT], requiredColumns=[{0}]): rowcount = 1.0, cumulative cost = 7.25, id = 61
> >>>         LogicalJoin(condition=[true], joinType=[inner]): rowcount = 1.0, cumulative cost = 1.0, id = 42
> >>>           LogicalTableScan(table=[[CATALOG, SALES, EMP]]): rowcount = 1.0, cumulative cost = 0.0, id = 11
> >>>           LogicalTableScan(table=[[CATALOG, SALES, DEPT]]): rowcount = 1.0, cumulative cost = 0.0, id = 12
> >>>         LogicalAggregate(group=[{}], EXPR$0=[AVG($5)]): rowcount = 1.0, cumulative cost = 2.125, id = 47
> >>>           LogicalFilter(condition=[=($cor0.EMPNO, $0)]): rowcount = 1.0, cumulative cost = 1.0, id = 45
> >>>             LogicalTableScan(table=[[CATALOG, SALES, EMP]]): rowcount = 1.0, cumulative cost = 0.0, id = 14
> >>>
> >>>
> >>> Thanks,
> >>> Aman
> >>>
> >>
> >>
>
>
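
For the plan quoted above, the split that a "push filter past Correlate"
rule would need can be sketched in plain Java. This is only an illustration
under stated assumptions: Conjunct and CorrelateFilterSplit are hypothetical
classes (a stand-in for real row expressions, not Calcite API), each
conjunct is reduced to the input field indexes it references, and the left
input of the Correlate is taken to have 11 fields ($0 to $10, as listed in
the second LogicalProject).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for one AND-ed term of a filter condition. */
final class Conjunct {
  final String text;   // printed form, e.g. "=($7, $9)"
  final int[] fields;  // input field indexes the term references

  Conjunct(String text, int... fields) {
    this.text = text;
    this.fields = fields;
  }
}

/** Hypothetical helper that picks the conjuncts safe to move below a Correlate. */
final class CorrelateFilterSplit {
  private CorrelateFilterSplit() {}

  /**
   * Returns the conjuncts that reference only fields of the left input,
   * i.e. indexes in [0, leftFieldCount). For the Correlate types INNER,
   * LEFT, SEMI and ANTI the left side is never NULL-padded, so these
   * conjuncts can be applied to the left input instead.
   */
  static List<String> pushableToLeft(List<Conjunct> conjuncts, int leftFieldCount) {
    List<String> result = new ArrayList<>();
    for (Conjunct c : conjuncts) {
      boolean leftOnly = true;
      for (int f : c.fields) {
        if (f >= leftFieldCount) {
          leftOnly = false;  // touches a column produced by the right input
          break;
        }
      }
      if (leftOnly) {
        result.add(c.text);
      }
    }
    return result;
  }
}
```

With the two conjuncts from the highlighted LogicalFilter, =($7, $9) uses
fields 7 and 9 and so qualifies, while >($5, $11) uses field 11 (the AVG
from the right side) and has to stay above the LogicalCorrelate. Once
=($7, $9) sits directly on top of the LogicalJoin, a rule like
FilterJoinRule could fold it into the join condition before decorrelation
copies the join elsewhere.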
