calcite-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Stamatis Zampetakis <zabe...@gmail.com>
Subject Re: Optimization of join between a small table and a large table
Date Thu, 08 Aug 2019 06:19:28 GMT
Hi Gabriel,

What you want indeed seems to be a nested loop join; there has been a
relevant discussion in the dev list [1].
You may also find relevant the ongoing discussion on CALCITE-2979 [2].

Best,
Stamatis

[1]
https://lists.apache.org/thread.html/d9f95683e66009872a53e7e617295158b98746b550d2bf68230b3096@%3Cdev.calcite.apache.org%3E

[2] https://issues.apache.org/jira/browse/CALCITE-2979

On Wed, Aug 7, 2019 at 2:56 PM Gabriel Reid <gabriel.reid@gmail.com> wrote:

> Hi,
>
> I'm currently working on a custom Calcite adapter, and I've got a situation
> where I want to join a small table with a large table, basically just using
> the small table to filter the large table. Conceptually, I'm talking about
> something that is equivalent to the following query:
>
>     select emp.name
>     from emp join dept on emp.deptid = dept.id
>         where dept.name = 'Sales'
>
> I've got converter rules which to push down filtering on all tables, which
> make a big difference in performance. The above query current results in an
> EnumerableHashJoin over a filtered scan over the 'dept' table, and an
> unfiltered scan over the 'emp' table.
>
> What I would like to accomplish is that this is converted into a (I think)
> a nested loop join between 'dept' and emp, so that the filtered scan is
> done once over 'dept', and then a filtered scan is done for each entry of
> the 'dept' table using the 'id' value from that entry as a filter value on
> the scan on the 'emp' table.
>
> I've been able to get a nested loop join to be used, but I haven't managed
> to have the 'id' values from the 'dept' table to be used to filter the
> 'emp' table. Instead, the full 'emp' table is scanned for each iteration of
> the 'dept' table.
>
> And now my questions: are my hopes/expectations here realistic, and/or is
> there a much better/easier way of accomplishing the same thing? Is there
> prior art somewhere within the Calcite code base or elsewhere? Or should
> this just be working by default?
>
> I would assume that this isn't that unusual of a situation, which is why I
> was expecting that there would already be something like this somewhere (or
> I'm doing the wrong thing), but I haven't managed to find any clear
> pointers in any one direction yet.
>
> Thanks in advance for any advice!
>
> - Gabriel
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message