sqoop-dev mailing list archives

From "Yulei Yang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SQOOP-3261) Enable charset convert when importing
Date Sun, 26 Nov 2017 16:31:01 GMT

    [ https://issues.apache.org/jira/browse/SQOOP-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16266090#comment-16266090 ]

Yulei Yang commented on SQOOP-3261:
-----------------------------------

Usage: sqoop import -D charset.from= -D charset.to=, each followed by a charset name. In the
regular case there is no need to set these two parameters. If you want to convert data stored as
WE8MSWIN1252 or US7ASCII into human-readable Chinese, set charset.from='ISO-8859-1' and
charset.to='GBK'.
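
For readers unfamiliar with this kind of conversion, the ISO-8859-1 to GBK setting above presumably amounts to re-decoding the raw bytes of each string value. The following is only a minimal sketch of that idea, not the code in sqoop-3261.patch: the class name CharsetConvertSketch and the method convert are hypothetical, and it assumes the JDK in use ships the GBK charset.

```java
import java.nio.charset.Charset;

public class CharsetConvertSketch {

    // Hypothetical helper (not taken from the patch): recover the raw bytes
    // by encoding the string with the "from" charset, then decode those
    // bytes again with the "to" charset.
    static String convert(String value, String from, String to) {
        if (value == null || from == null || to == null) {
            return value; // nothing to convert, mirrors the "regular case"
        }
        byte[] raw = value.getBytes(Charset.forName(from));
        return new String(raw, Charset.forName(to));
    }

    public static void main(String[] args) {
        // Two Chinese characters, written as escapes so the source file
        // encoding does not matter.
        String original = "\u4e2d\u6587";

        // Simulate the problem: GBK bytes wrongly decoded as ISO-8859-1.
        String garbled = new String(
                original.getBytes(Charset.forName("GBK")),
                Charset.forName("ISO-8859-1"));

        String repaired = convert(garbled, "ISO-8859-1", "GBK");
        System.out.println(repaired.equals(original)); // prints "true"
    }
}
```

ISO-8859-1 works as the intermediate charset in this sketch because Java maps each of its 256 byte values to exactly one character, so the original bytes survive the round trip unchanged.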

> Enable charset convert when importing
> -------------------------------------
>
>                 Key: SQOOP-3261
>                 URL: https://issues.apache.org/jira/browse/SQOOP-3261
>             Project: Sqoop
>          Issue Type: New Feature
>          Components: codegen
>    Affects Versions: 1.4.6
>            Reporter: Yulei Yang
>         Attachments: sqoop-3261.patch
>
>
> Hi,
> I think some users may need to convert the charset of data when importing it
> from an RDBMS. In my case, if I do nothing, a table that stores Chinese content in Oracle
> with charset WE8MSWIN1252 is unreadable in Hive. I know some databases can
> achieve this by setting the charset in the connection URL, but others lack this function
> or make it inconvenient to use. I have a general way to do this, and we have used this solution
> for several months in our company. Could someone please add me to the contributors list?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
