sqoop-dev mailing list archives

From daniel voros <daniel.vo...@gmail.com>
Subject Review Request 67699: Splitting on integer column can create more splits than necessary
Date Fri, 22 Jun 2018 09:40:07 GMT

This is an automatically generated e-mail. To reply, visit:

Review request for Sqoop.

Bugs: SQOOP-3336

Repository: sqoop-trunk


Running an import with -m 2 results in three splits when the split column contains only
three consecutive integers ({1, 2, 3}).

Work is (probably) spread more evenly between mappers this way, but ending up with more files
than expected could be an issue.

Split-limit can also produce more values than requested in the last chunk, because the
final interval is closed.
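
To make the -m 2 case concrete, here is a minimal Java sketch of how an integer splitter could end up with three splits for the values {1, 2, 3}. It is an illustration only, not the IntegerSplitter source or the patch under review: the class SplitSketch, its methods, and in particular the edge-case rule that appends an extra closing point are assumptions chosen to reproduce the behaviour described above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; NOT the Sqoop IntegerSplitter source.
class SplitSketch {

    // Computes boundary points between minVal and maxVal for numSplits splits.
    // Consecutive points form one interval each.
    static List<Long> splitPoints(long numSplits, long minVal, long maxVal) {
        List<Long> points = new ArrayList<>();
        long splitSize = (maxVal - minVal) / numSplits;
        long remainder = (maxVal - minVal) % numSplits;
        long cur = minVal;
        for (long i = 0; i <= numSplits; i++) {
            points.add(cur);
            if (cur >= maxVal) {
                break;
            }
            // Spread the remainder over the first few splits.
            cur += splitSize + (i < remainder ? 1 : 0);
        }
        // ASSUMPTION: when the range is no wider than the requested number of
        // splits, append maxVal again so the last value gets its own split.
        // This rule is what turns two requested splits into three here.
        if (points.size() == 1 || (maxVal - minVal) <= numSplits) {
            points.add(maxVal);
        }
        return points;
    }

    // Renders the intervals: all half-open except the last, which is closed.
    static List<String> intervals(List<Long> points) {
        List<String> out = new ArrayList<>();
        for (int i = 1; i < points.size(); i++) {
            boolean last = (i == points.size() - 1);
            out.add("[" + points.get(i - 1) + ", " + points.get(i) + (last ? "]" : ")"));
        }
        return out;
    }

    public static void main(String[] args) {
        // Two splits requested over the values 1..3.
        System.out.println(intervals(splitPoints(2, 1, 3)));
    }
}
```

Under these assumptions, requesting two splits over 1..3 yields the boundary points 1, 2, 3, 3 and therefore three intervals: [1, 2), [2, 3) and [3, 3]. Each mapper gets one value, which is the even spread mentioned above, at the cost of one more output file than -m 2 suggests.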


  src/docs/user/import.txt 2d074f49 
  src/java/org/apache/sqoop/mapreduce/db/IntegerSplitter.java 22c18e25 
  src/test/org/apache/sqoop/mapreduce/db/TestIntegerSplitter.java b43fc41f 

Diff: https://reviews.apache.org/r/67699/diff/1/


Corrected some tests that were previously flawed and added new tests for the
above-mentioned (-m 2) case.

Ran the normal unit tests and the third-party tests.


daniel voros
