calcite-dev mailing list archives

From Julian Hyde <julianh...@gmail.com>
Subject Re: About JoinCommuteRule
Date Mon, 23 Mar 2015 17:56:01 GMT
On Mar 20, 2015, at 3:02 PM, Maryann Xue <maryann.xue@gmail.com> wrote:

>>> I can't think of a good reason why JoinCommuteRule doesn't swap outer
>>> joins.
> 
> But right now the only call to swap() is with swapOuterJoins set to false.
> So I thought it might have some reason to do so. Can we change that?

I don’t remember why. Can you investigate, by running the test suite, and make a recommendation?
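
For reference, commuting an outer join means flipping the join type as well as exchanging the inputs: A LEFT JOIN B returns the same rows as B RIGHT JOIN A, while INNER and FULL joins are symmetric. The sketch below illustrates only that mapping. `SketchJoinType` and `swapped()` are hypothetical names made up for this example, not Calcite's own classes. Exchanging the inputs also reorders the output columns, so a real rule would additionally need a projection on top to restore the original column order.

```java
/** Hypothetical join-type enum for illustration only; not Calcite's own type. */
enum SketchJoinType {
  INNER, LEFT, RIGHT, FULL;

  /** Returns the join type that keeps the same rows once the two inputs are exchanged. */
  SketchJoinType swapped() {
    switch (this) {
      case LEFT:
        return RIGHT;
      case RIGHT:
        return LEFT;
      default:
        // INNER and FULL treat both inputs alike, so the type is unchanged.
        return this;
    }
  }
}
```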

> 
>>> EnumerableJoin originally built the left, probed the right, and
>>> therefore had a smaller cost if the smaller input were on the left.
> 
> Phoenix actually builds the right and probes the left.
> 
>>> But we changed it, because the convention in the optimizer world is to
>>> build left-deep trees, with the largest input on the left, and smaller,
>>> hopefully selective, inputs on the right.
> 
> So I assume EnumerableJoin now should give LHS a cheaper cost, right? It
> does not look that way in the code.

Oops, you’re right. EnumerableJoin is more expensive if the larger input is placed on the
left. I think that is a mistake.
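
To make the intended behavior concrete, here is a minimal sketch of a cost function for a hash join that builds its table from the right input and probes with the left. It is not EnumerableJoin's actual formula. The class name, method names, and the two weights are assumptions chosen only to show the direction of the preference: when building costs more per row than probing, the plan with the larger input on the left comes out cheaper.

```java
/** Illustrative cost model only; the names and weights are assumptions, not Calcite code. */
final class HashJoinCostSketch {
  // Assumed per-row weights: inserting a row into the hash table is taken to
  // cost more than probing the table with a row.
  static final double BUILD_WEIGHT = 2.0;
  static final double PROBE_WEIGHT = 1.0;

  private HashJoinCostSketch() {}

  /** Cost of a join that probes with the left input and builds from the right input. */
  static double cost(double leftRows, double rightRows) {
    return PROBE_WEIGHT * leftRows + BUILD_WEIGHT * rightRows;
  }

  /** Returns true if exchanging the two inputs would give a strictly lower cost. */
  static boolean shouldSwap(double leftRows, double rightRows) {
    return cost(rightRows, leftRows) < cost(leftRows, rightRows);
  }
}
```

Under these assumed weights, a 1,000-row input joined with a 10-row input is cheaper with the 1,000-row input on the left, so `shouldSwap(10, 1000)` is true and `shouldSwap(1000, 10)` is false.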

> Don't know if my understanding is correct, but I think a left-deep tree
> with largest relation on the left would most likely benefit nested loop
> joins. Phoenix is not able to do NL join, so either a left-deep tree with
> largest on the right or, if memory limit allows, a right-deep tree with
> largest on the left is preferable.

Although it would be nice if each join algorithm could choose its cost model, I think it would
make it a lot more complicated to build re-usable rules.

You should consider changing your join to match the convention. (And yes, we need to change
EnumerableJoin also.)

Julian

