drill-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jinfeng Ni" <...@maprtech.com>
Subject Review Request 16248: Fix merge join batch size, and it value vectors allocation; Add check to ensure a batch size is within a limit.
Date Fri, 13 Dec 2013 18:58:56 GMT

This is an automatically generated e-mail. To reply, visit:

Review request for drill.

Repository: drill-git


In this patch, 3 changes are made:

1) In JoinTemplate.java
 we only advance outputPosition if the operation of copyLeft or copyRight is successful. Previously,
the code will advance outputPosition even in case of fails, leading to an IndexOutofBoundary
Exception in downstream batch processing.
2) In MergeJoinBatch.java
 we change the row count for the value vectors allocated to hold values from left and right,
making sure the row count is same for left and right, and bounded by a limit
3) In IteratorValidatorBatchIterator.java
 we add check to make sure every incoming batch's size is within a limit. Otherwise, stop
the execution immediately. This will help debug similar problem in the future, by turn on
assert with vm option -ea.

Another minor change is DrillPrepareImpl.java. 
The change is to add PushJoinThroughJoinRule into optimizer, since currently optiq 0.4.11
does not add this rule, and could end up with cartisian join when the join predicates are
in WHERE clause.  If we move to optiq 0.4.16, this change is no longer needed. For now, this
minor change would help run multiple table join queries.


  exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/JoinStatus.java 762cce7

  exec/java-exec/src/main/java/org/apache/drill/exec/record/RecordBatch.java c13838c 
  sqlparser/src/main/java/org/apache/drill/optiq/DrillPrepareImpl.java af7bbc7 

Diff: https://reviews.apache.org/r/16248/diff/


1) The 4 table join query is successful.
2) Verify that the change in IteratorValidatorBatchIterator works as expected, by 1) reverse
the code fix in JoinTemplate.java 2) run the query, and the query stops at the first time
when the batch goes beyond the limit. 


Jinfeng Ni

  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message