sqoop-user mailing list archives

From Jarek Jarcec Cecho <jar...@apache.org>
Subject Re: Sqoop - utf-8 data load issue
Date Tue, 16 Jul 2013 18:05:27 GMT
Thank you for the additional information Varun! Would you mind doing something like the following:

 hadoop dfs -text THE_FILE  | hexdump -C

And sharing the output? I'm trying to see the actual content of the file rather than any interpreted
value.
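
For reference, here is a rough sketch of what I would look for in the dump. It assumes the sample value from your first mail, written as explicit UTF-8 bytes so that the terminal encoding plays no role. The helper names utf8_bytes and hex are made up for this illustration, and od is used only as a portable stand-in for hexdump.

```shell
# Hypothetical helper: emits "elémeñt" as raw UTF-8 bytes using octal escapes,
# so the result does not depend on the terminal or locale.
utf8_bytes() { printf 'el\303\251me\303\261t'; }

# Hypothetical helper: plain hex bytes, one space between each.
hex() { od -An -tx1; }

# Correctly stored UTF-8: é is "c3 a9" and ñ is "c3 b1".
utf8_bytes | hex
# bytes: 65 6c c3 a9 6d 65 c3 b1 74

# Simulated double encoding (an assumption about one possible failure mode):
# the UTF-8 bytes are misread as ISO-8859-1 and encoded to UTF-8 again.
utf8_bytes | iconv -f ISO-8859-1 -t UTF-8 | hex
# bytes: 65 6c c3 83 c2 a9 6d 65 c3 83 c2 b1 74
```

If the file in HDFS shows the first pattern, the data is fine and the display is at fault. If it shows something like the second pattern, the bytes were converted somewhere on the way in.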

Jarcec

On Mon, Jul 15, 2013 at 06:52:11PM -0700, varun kumar gullipalli wrote:
> Hi Jarcec,
> 
> I am validating the data by running the following command,
> 
> hadoop fs -text <hdfs cluster>
> 
> I think there is no issue with the shell (correct me if I am wrong) because I am connecting
> to the MySQL database from the same shell (command line) and can view the source data properly.
> 
> Initially we observed that the following conf files didn't have the utf-8 encoding declaration:
> <?xml version="1.0" encoding="UTF-8"?>
> 
> sqoop-site.xml
> sqoop-site-template.xml
> 
> But no luck even after making the changes.
> 
> Thanks,
> Varun
> 
> 
> ________________________________
>  From: Jarek Jarcec Cecho <jarcec@apache.org>
> To: user@sqoop.apache.org; varun kumar gullipalli <varunkumar_g@yahoo.com> 
> Sent: Monday, July 15, 2013 6:37 PM
> Subject: Re: Sqoop - utf-8 data load issue
>  
> 
> Hi Varun,
> we usually don't see any issues with transferring text data in UTF. How are
> you validating the imported file? I can imagine that your shell might be messing
> up the encoding.
> 
> Jarcec
> 
> On Mon, Jul 15, 2013 at 06:27:25PM -0700, varun kumar gullipalli wrote:
> > 
> > 
> > Hi,
> > I am importing data from MySQL to HDFS using free-form query import.
> > It works fine, but I am facing an issue when the data is utf-8. The source (MySQL) db is
> > utf-8 compatible, but it looks like sqoop is converting the data during import.
> > Example - The source value - elémeñt is loaded as elémeñt to HDFS.
> > Please provide a solution for this.
> > Thanks in advance!
