drill-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Timothy Chen" <tnac...@gmail.com>
Subject Re: Review Request 16248: Fix merge join batch size, and it value vectors allocation; Add check to ensure a batch size is within a limit.
Date Mon, 16 Dec 2013 06:18:55 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16248/#review30435
-----------------------------------------------------------



exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/JoinTemplate.java
<https://reviews.apache.org/r/16248/#comment58234>

    I wonder if you can have at least one unit test for your fix?


- Timothy Chen


On Dec. 13, 2013, 6:58 p.m., Jinfeng Ni wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/16248/
> -----------------------------------------------------------
> 
> (Updated Dec. 13, 2013, 6:58 p.m.)
> 
> 
> Review request for drill.
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> In this patch, 3 changes are made:
> 
> 1) In JoinTemplate.java
>  we only advance outputPosition if the operation of copyLeft or copyRight is successful.
Previously, the code will advance outputPosition even in case of fails, leading to an IndexOutofBoundary
Exception in downstream batch processing.
> 2) In MergeJoinBatch.java
>  we change the row count for the value vectors allocated to hold values from left and
right, making sure the row count is same for left and right, and bounded by a limit
> 3) In IteratorValidatorBatchIterator.java
>  we add check to make sure every incoming batch's size is within a limit. Otherwise,
stop the execution immediately. This will help debug similar problem in the future, by turn
on assert with vm option -ea.
> 
> Another minor change is DrillPrepareImpl.java. 
> The change is to add PushJoinThroughJoinRule into optimizer, since currently optiq 0.4.11
does not add this rule, and could end up with cartisian join when the join predicates are
in WHERE clause.  If we move to optiq 0.4.16, this change is no longer needed. For now, this
minor change would help run multiple table join queries.
> 
> 
> Diffs
> -----
> 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/JoinStatus.java
762cce7 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/JoinTemplate.java
aae1a3c 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/MergeJoinBatch.java
f3a32cd 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/validate/IteratorValidatorBatchIterator.java
d6f08f1 
>   exec/java-exec/src/main/java/org/apache/drill/exec/record/RecordBatch.java c13838c

>   sqlparser/src/main/java/org/apache/drill/optiq/DrillPrepareImpl.java af7bbc7 
> 
> Diff: https://reviews.apache.org/r/16248/diff/
> 
> 
> Testing
> -------
> 
> 1) The 4 table join query is successful.
> 2) Verify that the change in IteratorValidatorBatchIterator works as expected, by 1)
reverse the code fix in JoinTemplate.java 2) run the query, and the query stops at the first
time when the batch goes beyond the limit. 
> 
> 
> Thanks,
> 
> Jinfeng Ni
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message