sqoop-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Abraham Elmahrek <...@cloudera.com>
Subject Re: Comparing sqoop's output to hdfs to data in mysql
Date Wed, 17 Sep 2014 19:18:48 GMT
Try making your query a free-form query and casting the float to a string.
Something like the following worked well for me:

SELECT id, text, fl, CAST(fl AS CHAR(64)) FROM fl WHERE $CONDITIONS

On Tue, Sep 16, 2014 at 2:41 PM, pratik khadloya <tispratik@gmail.com>
wrote:

> Sure. The column in question is
>
> Field Type Null Default
> usage_fee_percent double YES 0
>
> If we run a mysql select query the value is 0
> If we cat the file on hdfs which is exported by sqoop the value is 0.0
>
> ======= Mysql Command =======
> mysql -u <user> -p<pwd> -h <host> <db> --raw -e "SELECT
> a.usage_fee_percent FROM accounts a" >
> /home/pkhadloya/sqoop_out/mysql_accounts
>
> ======= Sqoop Command =======
> bin/sqoop import -jt <jobtracker> --connect jdbc:mysql://.../<db>
> --username <user> --password <pwd> --target-dir
> /user/pkhadloya/sqoop/accounts --delete-target-dir --query "SELECT
> a.usage_fee_percent FROM accounts a WHERE \$CONDITIONS" --num-mappers 1
> --mapreduce-job-name accounts_sqoop_import --fields-terminated-by "\t"
>  --as-textfile
>
> ======= Diff Command =======
> bash -c "diff -U 0 <(tail -n +2 /home/pkhadloya/sqoop_out/mysql_accounts)
> <(hadoop fs -cat /user/pkhadloya/sqoop/accounts/part-m-00*) >
> /home/pkhadloya/sqoop_out/diff_accounts"
>
>
> Thanks for looking into this.
>
> Regards,
> Pratik
>
>
> On Tue, Sep 16, 2014 at 2:11 PM, Abraham Elmahrek <abe@cloudera.com>
> wrote:
>
>> Hey there,
>>
>> Could you provide us with the table description (types) and the sqoop
>> command you are running?
>>
>> -Abe
>>
>>
>> On Tue, Sep 16, 2014 at 11:19 AM, pratik khadloya <tispratik@gmail.com>
>> wrote:
>>
>>> Hello,
>>>
>>> I am comparing the mysql data (by dumping into a file) to the textfile
>>> imported by sqoop onto HDFS.
>>> Am using the diff tool to do the same.
>>>
>>> I observed the following differences:
>>> mysql      -->       sqoop_text_output
>>> \\n                       \n
>>> \\t                        \n
>>> \$                        $
>>> 0                         0.0
>>>
>>> So, it seems like mysql auto escapes the output with a \. I got around
>>> that by telling mysql not to do that so that i can compare properly. I had
>>> to pass the --raw flag to mysql. Then the only difference i currently see
>>> is that 0 being converted to 0.0 by sqoop (as mentioned in the docs).
>>>
>>> How can i make mysql also convert the 0 to a 0.0 when it dumps to a csv?
>>> Maybe the answer lines in the guts of sqoop that i can use myself. Or is it
>>> possible to tell sqoop not to convert 0 to 0.0 ?
>>>
>>> All in all, i am trying to verify the work done by sqoop for my
>>> satisfaction. Once i verify the text data is being exported fine, i will
>>> verify the same for the parquet format.
>>>
>>> Thanks,
>>> ~Pratik
>>>
>>
>>
>

Mime
View raw message