sqoop-dev mailing list archives

From daniel voros <daniel.vo...@gmail.com>
Subject Review Request 67699: Splitting on integer column can create more splits than necessary
Date Fri, 22 Jun 2018 09:40:07 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67699/
-----------------------------------------------------------

Review request for Sqoop.


Bugs: SQOOP-3336
    https://issues.apache.org/jira/browse/SQOOP-3336


Repository: sqoop-trunk


Description
-------

Running an import with -m 2 results in three splits when the split column contains only three
consecutive integers ({1, 2, 3}).

Work is (probably) spread more evenly between mappers this way, but ending up with more files
than expected could be an issue.

Using split-limit can also leave more values than requested in the last chunk, because the
final interval is closed at its upper bound.
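
The extra split in the {1, 2, 3} case can be reproduced with a small standalone model. This is
only a sketch: the class SplitSketch and its methods are hypothetical names, and the rule it
applies (append one more closing point whenever max - min is no larger than the requested
number of splits) is an assumption chosen to match the behaviour described above, not code
copied from IntegerSplitter.java.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of integer split-point generation; not Sqoop's actual class. */
public class SplitSketch {

    /**
     * Returns split points. Consecutive points form half-open intervals
     * [lo, hi), except the last interval, which is closed: [lo, hi].
     */
    static List<Long> splitPoints(long numSplits, long minVal, long maxVal) {
        List<Long> points = new ArrayList<>();
        long splitSize = (maxVal - minVal) / numSplits;
        long remainder = (maxVal - minVal) % numSplits;
        long cur = minVal;

        for (long i = 0; i <= numSplits; i++) {
            points.add(cur);
            if (cur >= maxVal) {
                break;
            }
            // Spread the remainder over the first few intervals.
            cur += splitSize + (i < remainder ? 1 : 0);
        }

        // Assumed rule: for a tiny range, add one more closing point so that
        // maxVal gets a split of its own. This is what yields the extra split.
        if (points.size() == 1 || (maxVal - minVal) <= numSplits) {
            points.add(maxVal);
        }
        return points;
    }

    /** Renders the intervals implied by the split points. */
    static List<String> describe(List<Long> points) {
        List<String> out = new ArrayList<>();
        for (int i = 1; i < points.size(); i++) {
            boolean last = (i == points.size() - 1);
            out.add(points.get(i - 1) + " <= x " + (last ? "<= " : "< ") + points.get(i));
        }
        return out;
    }

    public static void main(String[] args) {
        // Two splits requested over {1, 2, 3}: three intervals come back.
        // prints [1 <= x < 2, 2 <= x < 3, 3 <= x <= 3]
        System.out.println(describe(splitPoints(2, 1, 3)));
        // A wider range gives the requested two intervals.
        // prints [1 <= x < 6, 6 <= x <= 10]
        System.out.println(describe(splitPoints(2, 1, 10)));
    }
}
```

In this model the third interval exists only because of the appended closing point. Without
it, the last interval would be 2 <= x <= 3 and hold two values, which is the more even but
more numerous split trade-off mentioned above.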


Diffs
-----

  src/docs/user/import.txt 2d074f49 
  src/java/org/apache/sqoop/mapreduce/db/IntegerSplitter.java 22c18e25 
  src/test/org/apache/sqoop/mapreduce/db/TestIntegerSplitter.java b43fc41f 


Diff: https://reviews.apache.org/r/67699/diff/1/


Testing
-------

Corrected some previously flawed tests and added new tests for the -m 2 case described above.

Ran the normal unit tests and the third-party tests.


Thanks,

daniel voros

