drill-dev mailing list archives

From "Sean Hsuan-Yi Chu (JIRA)" <j...@apache.org>
Subject [jira] [Created] (DRILL-3117) Wrong Join-Order when In-List is materialized as a table
Date Sat, 16 May 2015 04:21:00 GMT
Sean Hsuan-Yi Chu created DRILL-3117:
----------------------------------------

             Summary: Wrong Join-Order when In-List is materialized as a table
                 Key: DRILL-3117
                 URL: https://issues.apache.org/jira/browse/DRILL-3117
             Project: Apache Drill
          Issue Type: Bug
          Components: Query Planning & Optimization
            Reporter: Sean Hsuan-Yi Chu
            Assignee: Sean Hsuan-Yi Chu


Once the number of elements in an In-List exceeds a threshold (set to 20 by DRILL-3009), Drill
materializes the In-List as a table instead of performing many comparisons for each row.
For instance, assume `c.json` has many rows (> 100,000) and we run this query:

select a.col from `c.json` a, `c.json` b
where a.col = b.col and a.id in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
18, 19, 20, 21);

Currently, Calcite generates a plan that JOINs tables a and b first, which is an extremely
expensive operation. Instead, the plan should first JOIN the materialized In-List table with
table a, so that a is reduced to the matching rows before it is JOINed with b.
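
To make the cost difference concrete, here is a minimal sketch that counts the rows each join order would produce on toy data. It is not Drill or Calcite code. The data distribution (100,000 rows, unique `id`, 1,000 distinct `col` values) and the helper `equi_join_row_count` are assumptions made purely for illustration.

```python
from collections import Counter


def equi_join_row_count(left_keys, right_keys):
    """Return how many rows an inner equi-join emits, given each input row's join key."""
    right_counts = Counter(right_keys)
    return sum(right_counts[key] for key in left_keys)


# Hypothetical stand-in for c.json: (id, col) pairs with unique ids
# and 1,000 distinct col values (assumed distribution, not from the report).
rows = [(i, i % 1000) for i in range(100_000)]
in_list = set(range(1, 22))  # the 21 literals from the query

# Order 1 (the reported plan): join a with b on col first.
cols = [col for _, col in rows]
a_join_b = equi_join_row_count(cols, cols)

# Order 2 (the desired plan): join the materialized In-List with a on id first,
# then join the much smaller result with b on col.
a_filtered = [(i, col) for i, col in rows if i in in_list]
in_join_a = len(a_filtered)
final = equi_join_row_count([col for _, col in a_filtered], cols)

print(a_join_b, in_join_a, final)
```

Under these assumptions the first order materializes 10,000,000 intermediate rows before the In-List is applied, while the second order produces 21 rows after the first join and 2,100 rows in the final result. Both orders return the same final rows; only the intermediate size differs.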

This issue is also the root cause of the slow query reported in DRILL-2929.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
