sqoop-dev mailing list archives

From "Yulei Yang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SQOOP-3261) Enable charset convert when importing
Date Sun, 26 Nov 2017 16:31:01 GMT

    [ https://issues.apache.org/jira/browse/SQOOP-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16266090#comment-16266090 ]

Yulei Yang commented on SQOOP-3261:

Usage: sqoop import -D charset.from=<source charset> -D charset.to=<target charset>. In the regular
case there is no need to set these two parameters. If you want to convert data from a WE8MSWIN1252
or US7ASCII database into human-readable Chinese, set charset.from='ISO-8859-1' and charset.to='GBK'.
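
For readers who have not seen this kind of conversion before, here is a minimal Java sketch of
the general re-encode/decode technique that a charset.from / charset.to pair suggests. It is not
taken from sqoop-3261.patch; the class name CharsetConvertSketch and the method convert are
hypothetical, and the sketch assumes the JDK in use ships the GBK charset. The idea is that
ISO-8859-1 maps every byte to exactly one character, so a string that was wrongly decoded as
ISO-8859-1 can be turned back into its original bytes and decoded again as GBK.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not the code from sqoop-3261.patch.
public class CharsetConvertSketch {

    // Re-encode the string with the "from" charset to recover the raw bytes,
    // then decode those bytes with the "to" charset.
    // If any argument is null, the value is returned unchanged.
    static String convert(String value, String from, String to) {
        if (value == null || from == null || to == null) {
            return value;
        }
        byte[] raw = value.getBytes(Charset.forName(from));
        return new String(raw, Charset.forName(to));
    }

    public static void main(String[] args) {
        // Two Chinese characters, written as Unicode escapes so the source
        // file encoding does not matter.
        String original = "\u4e2d\u6587";

        // Simulate the problem: GBK bytes read back as single-byte characters.
        byte[] gbkBytes = original.getBytes(Charset.forName("GBK"));
        String garbled = new String(gbkBytes, StandardCharsets.ISO_8859_1);

        // Apply the conversion with from=ISO-8859-1, to=GBK.
        String restored = convert(garbled, "ISO-8859-1", "GBK");
        System.out.println(restored.equals(original)); // prints "true"
    }
}
```

The round trip only works when the "from" charset can represent every character of the garbled
string without loss, which is why a single-byte charset such as ISO-8859-1 is used here.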

> Enable charset convert when importing
> -------------------------------------
>                 Key: SQOOP-3261
>                 URL: https://issues.apache.org/jira/browse/SQOOP-3261
>             Project: Sqoop
>          Issue Type: New Feature
>          Components: codegen
>    Affects Versions: 1.4.6
>            Reporter: Yulei Yang
>         Attachments: sqoop-3261.patch
> Hi,
> I think some users may need to convert the charset of data when importing it from an RDBMS.
> In my case, if I do nothing, a table that stores Chinese content in Oracle with charset
> WE8MSWIN1252 is unreadable in Hive. Yes, I know some databases can achieve this by setting the
> charset in the connection URL, while others lack this function or make it inconvenient to use.
> I have a general way to do this, and we have used this solution for several months in our
> company. Could someone please add me to the contributors list?

This message was sent by Atlassian JIRA
