sqoop-user mailing list archives

From Abraham Elmahrek <...@cloudera.com>
Subject Re: Handling Special Character while Sqoop Import
Date Sat, 22 Nov 2014 19:10:48 GMT
This could be going wrong in one of two places: loading into HDFS, or
extracting from MySQL. Sqoop should load everything as UTF-8 by default,
which supports Hindi.

What is your default character set in MySQL? Could you copy/paste your
my.cnf? Also, what version of MySQL are you running?
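
One quick check on the HDFS side: the `M-` sequences in the sample quoted
below look like the notation that tools such as `cat -v` use for bytes with
the high bit set. That is an assumption, since the thread does not say how
the file was viewed. Under that assumption, each complete `M-x` token stands
for the byte 0x80 + ord(x), and a rough sketch like the following (the
function name is made up for illustration) turns the sample back into raw
bytes so they can be compared with the UTF-8 encoding of the MySQL value:

```python
import re


def decode_meta_notation(text: str) -> bytes:
    """Turn complete 'M-x' tokens back into raw bytes (hypothetical helper).

    Assumes cat -v style notation where 'M-x' means byte 0x80 + ord(x).
    Control-character forms such as 'M-^A' are not handled, and a trailing
    'M-' with nothing after it is skipped as incomplete.
    """
    return bytes(0x80 + ord(m.group(1)) for m in re.finditer(r"M-(.)", text))


# The HDFS sample exactly as quoted in the thread (it ends mid-token).
sample = "M-`M-$M-8M-`M-%M-"
raw = decode_meta_notation(sample)

# U+0938 and U+0941 are the first two code points of the MySQL sample.
expected_prefix = "\u0938\u0941".encode("utf-8")[: len(raw)]

print(raw.hex(" "))            # the recovered bytes, in hex
print(raw == expected_prefix)  # do they match the start of the UTF-8 encoding?
```

If the two byte strings match, the file in HDFS may already hold valid UTF-8
and the problem could simply be how it is being displayed, so it would be
worth viewing it in a UTF-8 terminal before changing anything in MySQL.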

On Sat, Nov 22, 2014 at 12:28 AM, Vineet Mishra <clearmidoubt@gmail.com>
wrote:

> Hi Abe,
>
> Well, with the above statement I mean to say that the data which is
> residing in mysql is different from what is being imported via sqoop.
>
> So let me shoot out an example for the same,
>
> *Data in mysql : *सुरेन्द्र कुमार पाण्डेय
> *Data in HDFS(Sqoop import) : * M-`M-$M-8M-`M-%M-
>
> So this is the kind of change I am running into, and the data is
> completely losing its meaning.
>
> Any help would be appreciated.
>
> Thanks again!
>
> On Sat, Nov 22, 2014 at 2:15 AM, Abraham Elmahrek <abe@cloudera.com>
> wrote:
>
>> Hey there,
>>
>> Could you explain what you mean by "losing its meaning"? It's possible
>> you may need to set the character set:
>> http://dev.mysql.com/doc/connector-j/en/connector-j-reference-charsets.html
>> .
>>
>> -Abe
>>
>> On Fri, Nov 21, 2014 at 5:57 AM, Vineet Mishra <clearmidoubt@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> I am doing a Sqoop import with mysql as the source. Recently I figured
>>> out that the data imported through sqoop from mysql had some special
>>> characters and even control characters which were losing their meaning
>>> when moved to the sqoop data files.
>>>
>>> Looking for a solution for how to handle this case of special
>>> characters or, if possible, pruning the unwanted data out of my target
>>> dataset.
>>>
>>> Looking out for resolution at the earliest!
>>>
>>> Thanks!
>>>
>>
>>
>
