sqoop-dev mailing list archives

From "Benjamin BONNET (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SQOOP-2607) Direct import from Netezza and encoding
Date Fri, 09 Oct 2015 08:39:26 GMT

    [ https://issues.apache.org/jira/browse/SQOOP-2607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14950077#comment-14950077 ]

Benjamin BONNET commented on SQOOP-2607:
----------------------------------------

Regarding use of the new parameter: it is an extended parameter, 'encoding', passed after the
standalone '--' separator, that accepts any valid encoding name. The default value is UTF-8.
Example:
sqoop import --connect jdbc:netezza://host:port/base --username user --password password --direct \
    --table table --fields-terminated-by '|' --hive-import --create-hive-table --hive-table schema.table \
    -- --encoding=ISO-8859-15
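
To illustrate what "accepts a valid encoding, default UTF-8" could look like, here is a minimal Python sketch. It is not the connector's Java code: `resolve_encoding` and `DEFAULT_ENCODING` are hypothetical names chosen for this illustration, and the validation behaviour (rejecting unknown names) is an assumption rather than something stated in the patch.

```python
import codecs

# Assumed default, matching the default value described in the comment above.
DEFAULT_ENCODING = "UTF-8"


def resolve_encoding(name=None):
    """Return a canonical codec name, falling back to UTF-8 when unset.

    Hypothetical helper: raises ValueError for an unknown encoding name.
    """
    if not name:
        return codecs.lookup(DEFAULT_ENCODING).name
    try:
        return codecs.lookup(name).name
    except LookupError:
        raise ValueError("unsupported encoding: %r" % (name,))
```

Resolving the name up front means a mistyped encoding would fail before any data is read, instead of silently producing corrupted rows.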

> Direct import from Netezza and encoding
> ---------------------------------------
>
>                 Key: SQOOP-2607
>                 URL: https://issues.apache.org/jira/browse/SQOOP-2607
>             Project: Sqoop
>          Issue Type: Bug
>          Components: connectors
>    Affects Versions: 1.4.6
>            Reporter: Benjamin BONNET
>         Attachments: 0001-Add-a-table-encoding-parameter-for-Netezza-direct-im.patch
>
>
> Hi,
> I encountered an encoding issue while importing a Netezza table containing ISO-8859-15
> encoded VARCHAR values. In direct mode, non-ASCII characters are corrupted. This does not
> occur in non-direct mode.
> Direct mode uses a Netezza "external table", i.e. it flushes the table into
> a stream using the "internal" encoding (in my case, ISO-8859-15).
> But the Sqoop import mapper reads this stream as a UTF-8 one.
> The problem does not occur in non-direct mode, since that mode uses the Netezza JDBC driver to
> map fields directly to Java types (no stream encoding involved).
> To fix the issue in my environment, I modified the Sqoop Netezza connector and added
> a parameter to specify the Netezza VARCHAR encoding. The default value will be UTF-8, of course. I
> will make a pull request on GitHub to propose that enhancement.
> Regards
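
The corruption described in the quoted report can be reproduced in isolation. The following Python sketch is only an illustration of the charset mismatch, not Sqoop or Netezza code: `decode_stream` is a hypothetical stand-in for the mapper's read step, the sample string is invented, and replacing undecodable bytes (`errors="replace"`) is an assumption about how the reader treats invalid input.

```python
def decode_stream(raw, encoding="UTF-8"):
    """Decode bytes from the external-table stream (hypothetical helper).

    Undecodable bytes are replaced with U+FFFD; this is an assumption
    made for the illustration.
    """
    return raw.decode(encoding, errors="replace")


# Invented sample value containing non-ASCII characters.
text = "café à 5 €"

# What the database side writes when its internal encoding is ISO-8859-15.
raw = text.encode("ISO-8859-15")

# Reading the stream as UTF-8 (the default) corrupts the non-ASCII characters.
wrong = decode_stream(raw)

# Reading it with the matching encoding restores the original text.
right = decode_stream(raw, "ISO-8859-15")
```

ASCII characters survive either way because both encodings share that byte range, which is why only the non-ASCII characters show up as corrupted.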



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
