sqoop-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Glen Hein <glenh...@gmail.com>
Subject having trouble with --hive-delims-replacement
Date Tue, 10 Mar 2015 16:34:47 GMT
I have an oracle db were some of the records contain \n and \r characters. I
m trying to use --hive-delims-replacement to convert them to spaces. I've
had mixed results. It seems that when the \n and \r appear at the beginning
of the fields data, then they are converted as expected. But when they are
embedded in the middle of the field data, then they are not being replaced.
Here's a hexdump showing the unconverted \n and \r characters:

000000c0  6e 75 6c 6c 01 48 65 6c  6c 6f 2c 0d 0a 20 20 20
|null.Hello,..   |
000000d0  20 20 20 4d 79 20 6e 61  6d 65 20 69 73 20 4d 61  |   My name is
Ma|
000000e0  72 79 20 48 75 6e 74 2c  20 49 20 61 6d 20 31 39  |ry Hunt, I am
19|
000000f0  20 79 65 61 72 73 20 6f  6c 64 20 61 6e 64 20 6c  | years old and
l|
00000100  69 76 65 20 69 6e 20 43  61 6c 69 66 6f 72 6e 69  |ive in
Californi|
00000110  61 2e 20 49 20 67 72 61  64 75 61 74 65 64 20 68  |a. I graduated
h|
00000120  69 67 68 20 73 63 68 6f  6f 6c 20 6f 6e 20 32 30  |igh school on
20|
00000130  31 31 20 61 6e 64 20 64  65 63 69 64 65 64 20 74  |11 and decided
t|
00000140  6f 20 74 61 6b 65 20 61  20 79 65 61 72 20 6f 66  |o take a year
of|
00000150  66 20 74 6f 20 73 65 65  20 69 66 20 77 68 61 74  |f to see if
what|
00000160  20 49 20 63 68 6f 73 65  20 61 73 20 61 20 66 75  | I chose as a
fu|
00000170  74 75 72 65 20 63 61 72  65 65 72 20 69 73 20 77  |ture career is
w|
00000180  68 61 74 20 49 20 72 65  61 6c 79 20 77 61 6e 74  |hat I realy
want|
00000190  65 64 20 74 6f 20 64 6f  2e 20 4d 79 20 79 65 61  |ed to do. My
yea|
000001a0  72 20 69 73 20 75 70 20  61 6e 64 20 49 20 73 74  |r is up and I
st|
000001b0  69 6c 6c 20 77 61 6e 74  20 74 6f 20 62 65 20 61  |ill want to be
a|
000001c0  20 74 65 61 63 68 65 72  2e 20 49 20 6c 6f 6f 6b  | teacher. I
look|
000001d0  20 66 6f 77 61 72 64 20  74 6f 20 6c 65 61 72 6e  | foward to
learn|
000001e0  69 6e 67 20 61 6c 6c 20  74 68 61 74 20 49 20 63  |ing all that I
c|
000001f0  61 6e 20 61 6e 64 20 62  65 69 6e 67 20 61 6e 20  |an and being
an |
00000200  65 6c 65 6d 65 6e 74 72  79 20 73 63 68 6f 6f 6c  |elementry
school|
00000210  20 74 65 61 63 68 65 72  2e 01 6e 75 6c 6c 01 31  |
teacher..null.1|

You can see in the first row of data there is a "01" that starts the field,
and then a few characters later the 0d and 0a. I'm passing the following
option to sqoop:

                <arg>--hive-delims-replacement</arg>
                <arg>ABCDEFG</arg>

I've tried several different values for the actual replacement string, but
the results are always the same.

Is there a better way to replace the unwanted characters?

Thanks,
Glen

Mime
View raw message