sqoop-user mailing list archives

From "Vikash Talanki -X (vtalanki - INFOSYS LIMITED at Cisco)" <vtala...@cisco.com>
Subject Convert new line chars from oracle to hive using sqoop
Date Mon, 22 Sep 2014 18:27:37 GMT
Hi All,

We are using the string '<EOL>' (--hive-delims-replacement '<EOL>') to replace newline
characters in Oracle fields while importing data into Hive using Sqoop.
According to the Sqoop documentation - http://sqoop.apache.org/docs/1.4.3/SqoopUserGuide.html#_large_objects
- this parameter should replace only \n, \r, or \01 (^A) characters with '<EOL>'.
But we are seeing that some special characters are also being replaced with '<EOL>'.
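
For illustration, the documented behavior can be sketched in Python as below. This is only a model of what the user guide describes, not Sqoop's actual code; the function name replace_hive_delims is made up for this example.

```python
import re

# Characters the Sqoop user guide lists for --hive-delims-replacement:
# \n (LF, 0x0A), \r (CR, 0x0D) and \01 (^A, 0x01).
HIVE_DELIMS = re.compile(r"[\n\r\x01]")


def replace_hive_delims(value: str, replacement: str) -> str:
    """Replace each listed delimiter character with the replacement string.

    Hypothetical helper that models the documented behavior only.
    """
    # A callable is used so the replacement text is inserted literally,
    # without regex backslash processing.
    return HIVE_DELIMS.sub(lambda _match: replacement, value)


if __name__ == "__main__":
    print(replace_hive_delims("line1\nline2\x01end", "<EOL>"))
    # line1<EOL>line2<EOL>end
```

Under this model, any other character (for example a tab) would be left untouched.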

Our scenario:
[Screen captures: Oracle field, Hive field, Notepad++]

However, a character in the above sample that is NOT visible in Oracle shows up as 'SOH'
in Notepad++ and as '_' in Word, and Sqoop converts it to '<EOL>'.
Please help us understand this behavior.
What do these characters mean to Sqoop/Hive?
Is Sqoop expected to replace characters that are not \n, \r, or \01 (^A)?
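
One way to check which character is involved is to list the control characters in the raw field value with their code points. 'SOH' (Start of Heading) is the ASCII name for byte 0x01, which is also written ^A or \01, so the character Notepad++ labels 'SOH' may be one of the three the documentation lists. The sketch below is a hypothetical helper, not part of Sqoop or Hive, and its name table covers only a few common control characters.

```python
# ASCII abbreviations for a few control characters (not a full table).
CONTROL_NAMES = {0x01: "SOH", 0x09: "HT", 0x0A: "LF", 0x0D: "CR"}


def describe_control_chars(text: str) -> list:
    """Return (index, hex code, ASCII name) for each control character.

    Hypothetical helper for inspecting a field value; characters with
    code points below 0x20 are treated as control characters.
    """
    found = []
    for index, ch in enumerate(text):
        code = ord(ch)
        if code < 0x20:
            found.append((index, f"0x{code:02x}", CONTROL_NAMES.get(code, "?")))
    return found


if __name__ == "__main__":
    # Sample value with an embedded 0x01 and a trailing newline.
    print(describe_control_chars("ab\x01c\n"))
    # [(2, '0x01', 'SOH'), (4, '0x0a', 'LF')]
```

If the Oracle value reports 0x01 at the position where '<EOL>' appears in Hive, the replacement would match the documented list.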

Vikash Talanki
Engineer - Software
Phone: +1 (408)838 4078

Cisco Systems Limited
SJ-J 3
255 W Tasman Dr
San Jose
CA - 95134
United States

