sqoop-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Abraham Elmahrek" <...@cloudera.com>
Subject Re: Review Request: Fix for SQOOP-1013
Date Fri, 31 May 2013 20:30:38 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11537/#review21237
-----------------------------------------------------------


Seems good except for a few comments and nit picks.

The datetime split logic is when certain intervals are "hotter" than others. IE: 1000000 rows
out of 20000000 exist between the date range of december 1st to december 31st, but a user
is importing the entire years data, with 12 node. Basically, 1 machine will extract 1000000
rows, while the others will extract 1000000/11 rows. In the future we could probably add some
upfront analysis of the data to improve distribution.


connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportPartitioner.java
<https://reviews.apache.org/r/11537/#comment44096>

    Nit: This loop can probably handle all partitions by using constructDateConditions(objLB,
objUB, false) to constructDateConditions(objLB, objUB, upperBound == maxDateValue). Also,
upperBound += (i < remainder) ? 1 : 0;



connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportPartitioner.java
<https://reviews.apache.org/r/11537/#comment44094>

    Nit: This seems like it can be merged above.



connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportPartitioner.java
<https://reviews.apache.org/r/11537/#comment44099>

    There doesn't seem to be any thing stopping partitionMaxValue from being null. So, it
makes sense to add some null checking here as well.



connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportPartitioner.java
<https://reviews.apache.org/r/11537/#comment44098>

    There doesn't seem to be any thing stopping partitionMaxValue from being null. So, it
makes sense to add some null checking here as well.


- Abraham Elmahrek


On May 30, 2013, 6:25 p.m., Venkat Ranganathan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/11537/
> -----------------------------------------------------------
> 
> (Updated May 30, 2013, 6:25 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Description
> -------
> 
> This addresses Boolean, date, time, and timestamp splitters.
> 
> THis also disallows char type splitters as discussed in SQOOP-976
> 
> 
> Diffs
> -----
> 
>   connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportPartitioner.java
f80f30d 
>   connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportPartitioner.java
ee314d0 
> 
> Diff: https://reviews.apache.org/r/11537/diff/
> 
> 
> Testing
> -------
> 
> Introduced new unit tests to test new functionality
> All tests pass
> 
> 
> Thanks,
> 
> Venkat Ranganathan
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message