kylin-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KYLIN-3934) sqoop import param '--null-string' result in null value become blank string in hive table
Date Tue, 09 Apr 2019 04:01:00 GMT

    [ https://issues.apache.org/jira/browse/KYLIN-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812995#comment-16812995 ]

ASF GitHub Bot commented on KYLIN-3934:
---------------------------------------

freewh commented on pull request #587: KYLIN-3934 add config for sqoop config null-string and null-non-string
URL: https://github.com/apache/kylin/pull/587
 
 
   Add config for the sqoop params null-string and null-non-string.
   Fix a build error by adding the source version and target version in scala-maven-plugin.
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> sqoop import param '--null-string' result in null value become blank string in hive table
> -----------------------------------------------------------------------------------------
>
>                 Key: KYLIN-3934
>                 URL: https://issues.apache.org/jira/browse/KYLIN-3934
>             Project: Kylin
>          Issue Type: Bug
>          Components: Others
>    Affects Versions: v2.6.0
>            Reporter: wanghao
>            Priority: Major
>             Fix For: v2.6.2
>
>
> When a column value from JDBC is null, Sqoop writes it into the Hive table as a blank string.
> For example:
> jdbc:
> A | B
> 1 | 1
> 2 | 2
> a | null
>  
> hive table:
> A | B
> 1 | 1
> 2 | 2
> a |
> Because of this, count(distinct B) returns 3 instead of 2, which can lead to other problems.
>  
>  
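To illustrate the counting difference described above, here is a minimal sketch in plain Java. It does not run Hive or Sqoop; the class and method names are hypothetical, and the method only mimics the SQL rule that COUNT(DISTINCT col) skips NULLs while treating an empty string as an ordinary value.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

class CountDistinctDemo {
    // Mimics SQL COUNT(DISTINCT col): NULLs are skipped,
    // but an empty string counts as a distinct value.
    public static long countDistinct(List<String> column) {
        return column.stream().filter(Objects::nonNull).distinct().count();
    }

    public static void main(String[] args) {
        // Column B as it exists in the JDBC source (third row is NULL).
        List<String> jdbcColumnB = Arrays.asList("1", "2", null);
        // Column B after the import wrote the NULL as a blank string.
        List<String> hiveColumnB = Arrays.asList("1", "2", "");

        System.out.println(countDistinct(jdbcColumnB)); // prints 2
        System.out.println(countDistinct(hiveColumnB)); // prints 3
    }
}
```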
> {code:java}
> String cmd = String.format(Locale.ROOT,
>         "%s/bin/sqoop import" + generateSqoopConfigArgString()
>         + "--connect \"%s\" --driver %s --username %s --password %s --query \"%s AND \\$CONDITIONS\" "
>         + "--target-dir %s/%s --split-by %s --boundary-query \"%s\" --null-string '' "
>         + "--fields-terminated-by '%s' --num-mappers %d",
>         sqoopHome, connectionUrl, driverClass, jdbcUser, jdbcPass, selectSql, jobWorkingDir, hiveTable,
>         splitColumn, bquery, filedDelimiter, mapperNum);
> {code}
> The param '--null-string' should be '\\N' instead of the blank string ''.
> I resolved this problem by replacing the param, but the value should be configurable in kylin.properties.
>  
>  
>  
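One way the configurable behavior requested above could look is sketched below. This is not the code from the pull request: the property keys, class name, and method name are hypothetical placeholders, and java.util.Properties stands in for whatever reads kylin.properties. The default follows the report's suggestion of passing '\\N', which Sqoop unescapes to \N, the marker Hive's default text format reads as NULL.

```java
import java.util.Locale;
import java.util.Properties;

class SqoopNullArgsSketch {
    // Hypothetical property keys, used only for this illustration;
    // the real key names are whatever the pull request defines.
    public static final String NULL_STRING_KEY = "example.sqoop.null-string";
    public static final String NULL_NON_STRING_KEY = "example.sqoop.null-non-string";

    // Runtime value is \\N (two backslashes followed by N), as suggested in the report.
    public static final String DEFAULT_NULL = "\\\\N";

    // Builds the null-handling fragment of the sqoop command line,
    // falling back to the default when a property is not set.
    public static String nullArgs(Properties props) {
        String nullString = props.getProperty(NULL_STRING_KEY, DEFAULT_NULL);
        String nullNonString = props.getProperty(NULL_NON_STRING_KEY, DEFAULT_NULL);
        return String.format(Locale.ROOT,
                "--null-string '%s' --null-non-string '%s' ",
                nullString, nullNonString);
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        // No overrides: both params use the default.
        System.out.println(nullArgs(props));

        // Explicitly restoring the old blank-string behavior for string columns.
        props.setProperty(NULL_STRING_KEY, "");
        System.out.println(nullArgs(props));
    }
}
```

In the command shown in the report, a fragment like this would take the place of the hard-coded --null-string '' argument, so existing deployments that rely on the blank string could keep it by setting the property explicitly.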



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
