sqoop-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SQOOP-3002) Sqoop Merge Tool support composite merge-key
Date Mon, 22 Aug 2016 15:25:21 GMT

    [ https://issues.apache.org/jira/browse/SQOOP-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15430986#comment-15430986 ]

ASF GitHub Bot commented on SQOOP-3002:
---------------------------------------

Github user kevin00chen commented on a diff in the pull request:

    https://github.com/apache/sqoop/pull/26#discussion_r75698508
  
    --- Diff: src/java/org/apache/sqoop/mapreduce/MergeMapperBase.java ---
    @@ -76,9 +76,10 @@ protected void processRecord(SqoopRecord r, Context c)
         }
         Object keyObj = null;
         if (keyColName.contains(",")) {
    +        String connectStr = new String(new byte[]{1});
             StringBuilder keyFieldsSb = new StringBuilder();
             for (String str : keyColName.split(",")) {
    -            keyFieldsSb.append("+").append(fieldMap.get(str).toString());
    +            keyFieldsSb.append(connectStr).append(fieldMap.get(str).toString());
    --- End diff --
    
    For example, suppose a table has two columns, a and b:
    
    Field a | Field b
    ------------ | -------------
    a+ | b
    a | +b
    
    If "+" is used to join the two fields, both records produce the same keyObj ("+a++b").
    To avoid this, I use a String containing the single byte 0x01 as the separator.
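
The collision described above can be reproduced with a small standalone sketch. This is not the Sqoop code: `CompositeKeyDemo` and `joinKey` are hypothetical names used only for illustration, and the sketch assumes that `new String(new byte[]{1})` decodes to the character U+0001, which holds for ASCII-compatible default charsets such as UTF-8.

```java
import java.util.Arrays;
import java.util.List;

public class CompositeKeyDemo {
    // Hypothetical helper (not part of Sqoop): builds a composite key by
    // prefixing every field value with the separator, mirroring the loop
    // in the diff above.
    static String joinKey(String separator, List<String> fieldValues) {
        StringBuilder sb = new StringBuilder();
        for (String value : fieldValues) {
            sb.append(separator).append(value);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The two records from the table in the comment.
        List<String> row1 = Arrays.asList("a+", "b");
        List<String> row2 = Arrays.asList("a", "+b");

        // With "+" as the separator, both rows yield "+a++b".
        System.out.println(joinKey("+", row1).equals(joinKey("+", row2))); // true

        // With U+0001 as the separator, the keys differ.
        String ctrlA = "\u0001";
        System.out.println(joinKey(ctrlA, row1).equals(joinKey(ctrlA, row2))); // false
    }
}
```

Note that a control-character separator only makes collisions unlikely, not impossible: field values that themselves contain U+0001 could still produce identical keys.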


> Sqoop Merge Tool support composite merge-key
> --------------------------------------------
>
>                 Key: SQOOP-3002
>                 URL: https://issues.apache.org/jira/browse/SQOOP-3002
>             Project: Sqoop
>          Issue Type: Improvement
>          Components: hive-integration
>    Affects Versions: 1.4.5, 1.4.6, 1.99.5, 1.99.7
>            Reporter: KaimingChen
>
> When I use the Sqoop merge tool, I can specify only one column with the --merge-key argument.
> But when my table has a composite key and I pass --merge-key column1,column2, I get an exception:
> 16/08/22 15:54:15 INFO mapreduce.Job: Task Id : attempt_1470135750174_2508_m_000004_2, Status : FAILED
> Error: java.io.IOException: Cannot join values on null key. Did you specify a key column that exists?
> 	at org.apache.sqoop.mapreduce.MergeMapperBase.processRecord(MergeMapperBase.java:79)
> 	at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:58)
> 	at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:34)
> 	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
> 	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
> 	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
