sqoop-dev mailing list archives

From "Cheolsoo Park" <cheol...@cloudera.com>
Subject Re: Review Request: SQOOP-603: Support small intervals in IntegerSplitter implementation
Date Thu, 20 Sep 2012 16:35:24 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7193/#review11735
-----------------------------------------------------------


Hi Jarcec,

What if min = 0, max = 1, and numSplits = 5?

Following the split() function,

splitSize = (1 - 0) / 5 = 0;
remainder = (1 - 0) % 5 = 1;

After the for loop,

splits = (0, 1)

Now (maxVal - minVal) <= numSplits is true as (1 - 0) <= 5,

so we add maxVal to splits.

splits = (0, 1, 1)

so we end up with splits as follows:

[0, 1)
[1, 1) => redundant split that includes no values
[1, 1]
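
The arithmetic above can be reproduced with a minimal sketch. It is not the patch under review: the class name IntegerSplitSketch and the exact loop shape are assumptions, chosen only so that the loop yields (0, 1) as described above, and the final check is the (maxVal - minVal) <= numSplits condition from this message.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the split() method discussed in this thread.
class IntegerSplitSketch {

    static List<Long> split(long numSplits, long minVal, long maxVal) {
        List<Long> splits = new ArrayList<>();
        long splitSize = (maxVal - minVal) / numSplits;  // (1 - 0) / 5 = 0
        long remainder = (maxVal - minVal) % numSplits;  // (1 - 0) % 5 = 1
        long curVal = minVal;

        // Assumed loop: emit split points until maxVal is reached,
        // spreading the remainder over the first few steps.
        for (long i = 0; i <= numSplits; i++) {
            splits.add(curVal);
            if (curVal >= maxVal) {
                break;
            }
            curVal += splitSize + (i < remainder ? 1 : 0);
        }

        // The check from this message: append maxVal once more when the
        // interval is no wider than the requested number of splits.
        if ((maxVal - minVal) <= numSplits) {
            splits.add(maxVal);
        }
        return splits;
    }

    public static void main(String[] args) {
        System.out.println(split(5, 0, 1)); // prints [0, 1, 1]
    }
}
```

With min = 0, max = 1 and numSplits = 5 the sketch returns the split points (0, 1, 1), so maxVal appears twice.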

This case can happen if the user sets -m to an unnecessarily large number, can't it?

Please correct me if I am wrong.

Thanks!

- Cheolsoo Park


On Sept. 20, 2012, 3:37 p.m., Jarek Cecho wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/7193/
> -----------------------------------------------------------
> 
> (Updated Sept. 20, 2012, 3:37 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Description
> -------
> 
> I've decided to alter the split() method to add one extra maxVal when the number of
> split points is less than or equal to the requested split count.
> 
> 
> This addresses bug SQOOP-603.
>     https://issues.apache.org/jira/browse/SQOOP-603
> 
> 
> Diffs
> -----
> 
>   src/java/org/apache/sqoop/mapreduce/db/DataDrivenDBInputFormat.java 35b74eb 
>   src/java/org/apache/sqoop/mapreduce/db/IntegerSplitter.java 8e7a096 
>   src/test/org/apache/sqoop/mapreduce/db/TestIntegerSplitter.java 22d5140 
> 
> Diff: https://reviews.apache.org/r/7193/diff/
> 
> 
> Testing
> -------
> 
> * ant test
> * Real MySQL instance in a couple of scenarios
> 
> 
> Thanks,
> 
> Jarek Cecho
> 
>

