drill-user mailing list archives

From: Zelaine Fong <zf...@maprtech.com>
Subject: Re: Cartesian Product in Apache Drill
Date: Tue, 27 Dec 2016 02:58:06 GMT

I'm not sure how widely nested loop joins outside of scalar subqueries have
been exercised by Drill users, since that setting is not the default.  Note
that nested loop joins can only be processed using broadcast joins [1], so
you will incur a lot of network transfer overhead unless the smaller of the
two tables you're joining is kept small.

[1] https://drill.apache.org/docs/join-planning-guidelines/

-- Zelaine
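
Because the reply above ties nested loop joins to broadcast joins, it can help to look at the broadcast-related planner options before enabling the feature. The following is a minimal sketch, assuming the option names `planner.enable_broadcast_join` and `planner.broadcast_threshold` as listed in the Drill configuration docs; the threshold value shown is illustrative only, not a recommendation.

```sql
-- List the broadcast-related planner options and their current values.
SELECT * FROM sys.options WHERE name LIKE 'planner.%broadcast%';

-- Assumption: broadcast joins are normally enabled already; this just makes it explicit
-- for the current session.
ALTER SESSION SET `planner.enable_broadcast_join` = true;

-- Assumption: this option caps the estimated number of rows the planner is willing
-- to broadcast. The value below is illustrative only.
ALTER SESSION SET `planner.broadcast_threshold` = 10000000;
```

If the smaller side of the join is large, it is copied across the network to the nodes doing the join, which is the overhead described above.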

On Mon, Dec 26, 2016 at 7:05 AM, clhubert@gmail.com <clhubert@gmail.com>
wrote:

> Zelaine,
>
> I appreciate it...   That worked.
>
> I am thinking of turning on this feature system wide.
>
> Is there any foreseeable issue with using nested loop joins outside of
> scalar subqueries?  Performance or otherwise?
>
> Regards,
> CLN
>
>
> On Sun, Dec 25, 2016 at 7:22 PM, Zelaine Fong <zfong@maprtech.com> wrote:
>
>> Alternatively, you can set the following configuration to false:
>>
>> alter session set `planner.enable_nljoin_for_scalar_only` = false;
>>
>> Cartesian joins need to be processed as nested loop joins, and by
>> default, Drill only considers nested loop joins when at least one side
>> of the join is a scalar subquery.
>>
>> -- Zelaine
>>
>> On Sun, Dec 25, 2016 at 2:46 PM, Ted Dunning <ted.dunning@gmail.com>
>> wrote:
>>
>>> You can work around the limitation by adding a constant column to both
>>> tables, I think, and then joining on the constant.
>>>
>>>
>>>
>>> On Sun, Dec 25, 2016 at 2:04 PM, clhubert@gmail.com <clhubert@gmail.com>
>>> wrote:
>>>
>>> >
>>> > I am trying to do a cross join to get a Cartesian product.
>>> >
>>> > Per the error message (attached) and the JIRA ticket I see it isn't
>>> > supported.
>>> > https://issues.apache.org/jira/browse/DRILL-3807
>>> >
>>> > I wrote the query against CSV files using the dfs storage plugin.
>>> >
>>> > Can I execute a cross join in Apache Drill just by moving my data to a
>>> > different file type or storage plugin, such as Parquet, JSON, or an
>>> > RDBMS plugin?
>>> >
>>> > Regards,
>>> > CLN
>>> >
>>> >
>>>
>>
>>
>
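
For readers following the thread, here is a minimal sketch of the approach that was reported to work: disabling `planner.enable_nljoin_for_scalar_only` and then running a cross join. The file paths and aliases are hypothetical, and the query assumes headerless CSV files read through the dfs plugin, where fields are exposed through the `columns` array.

```sql
-- Let the planner consider nested loop joins even when neither side is a
-- scalar subquery (affects the current session only).
ALTER SESSION SET `planner.enable_nljoin_for_scalar_only` = false;

-- Hypothetical paths: replace with real files in your dfs workspace.
SELECT a.columns[0] AS left_val,
       b.columns[0] AS right_val
FROM dfs.`/tmp/left.csv` a
CROSS JOIN dfs.`/tmp/right.csv` b;

-- To apply the setting to all sessions instead, the system-level form would be:
-- ALTER SYSTEM SET `planner.enable_nljoin_for_scalar_only` = false;
```

The result has one row for every pairing of a left row with a right row, so it is worth keeping at least one input small, given the broadcast overhead noted in the thread.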

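The constant-column workaround suggested in the thread could look roughly like the sketch below. It is untested, the paths and the column name `k` are hypothetical, and the planner may simplify the constant join condition and still treat the query as a Cartesian join, so treat it as an idea to try rather than a confirmed fix.

```sql
-- Add the same constant to both inputs and join on it, so the query
-- is written as an equality join rather than a cross join.
SELECT l.columns[0] AS left_val,
       r.columns[0] AS right_val
FROM (SELECT columns, 1 AS k FROM dfs.`/tmp/left.csv`) l
JOIN (SELECT columns, 1 AS k FROM dfs.`/tmp/right.csv`) r
  ON l.k = r.k;
```

Every row on both sides carries the same key, so the join matches each left row with each right row, which is the same output as a cross join.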