sqoop-dev mailing list archives

From "Daniel Templeton (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SQOOP-513) Provide a way to override the default splitter
Date Tue, 03 Jul 2012 22:40:34 GMT

    [ https://issues.apache.org/jira/browse/SQOOP-513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13406130#comment-13406130 ]

Daniel Templeton commented on SQOOP-513:

It should also be possible to pass parameters to the custom splitter from the command line.
> Provide a way to override the default splitter
> ----------------------------------------------
>                 Key: SQOOP-513
>                 URL: https://issues.apache.org/jira/browse/SQOOP-513
>             Project: Sqoop
>          Issue Type: Improvement
>    Affects Versions: 1.4.1-incubating
>            Reporter: Cheolsoo Park
> When the number of mappers is greater than 1, Sqoop divides rows using simple queries such as:
> {code}
> select x, y from foo where x > 10 and x <= 20.
> {code}
> The width of each range is simply (max - min) / # of mappers. This works fine if the values of the split-by column are evenly distributed; however, it works poorly when, for example, the distribution is skewed.
> The proposal is to provide a way for the user to override the default splitter. For example, the user should be able to write their own splitter class, pass the class name via a command option, and use that splitter at runtime instead of the default splitter.
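
To make the skew problem concrete, here is a minimal sketch in Python. It is an illustration only, not Sqoop's code (Sqoop is written in Java). The function names, the `SPLITTERS` registry, and the quantile-based alternative are all hypothetical; only the (max - min) / # of mappers rule comes from the issue description.

```python
def even_splits(values, num_mappers):
    """Default-style split: equal-width ranges of (max - min) / # of mappers."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / num_mappers
    bounds = [lo + i * step for i in range(num_mappers)] + [hi]
    return list(zip(bounds[:-1], bounds[1:]))


def quantile_splits(values, num_mappers):
    """Hypothetical custom splitter: boundaries picked so ranges hold similar row counts."""
    ordered = sorted(values)
    n = len(ordered)
    cuts = [ordered[(i * n) // num_mappers] for i in range(1, num_mappers)]
    bounds = [ordered[0]] + cuts + [ordered[-1]]
    return list(zip(bounds[:-1], bounds[1:]))


def rows_per_split(values, splits):
    """Count rows per range: the first range is [lo, hi], later ones are (lo, hi]."""
    counts = []
    for i, (lo, hi) in enumerate(splits):
        if i == 0:
            counts.append(sum(1 for v in values if lo <= v <= hi))
        else:
            counts.append(sum(1 for v in values if lo < v <= hi))
    return counts


# Hypothetical registry standing in for "pass the class name via a command option".
SPLITTERS = {"even": even_splits, "quantile": quantile_splits}


if __name__ == "__main__":
    # Skewed split-by column: 99 small values and one large outlier.
    column = list(range(1, 100)) + [1000]
    for name, splitter in SPLITTERS.items():
        print(name, rows_per_split(column, splitter(column, 4)))
```

With this skewed column and 4 mappers, the equal-width split gives one mapper 99 of the 100 rows and leaves two mappers with none, while the quantile-based split assigns 26, 25, 25, and 24 rows. A real override would also need a way to receive parameters, as the comment above requests.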
