sqoop-user mailing list archives

From Tushar Sudake <etusha...@gmail.com>
Subject Re: Sqoop export using non-new line row delimiter
Date Tue, 22 May 2012 07:16:41 GMT
Hi,

Found that Cloudera Sqoop has an open issue that doesn't allow any record
delimiter other than '\n' on export:
https://issues.cloudera.org/browse/SQOOP-136

That would explain the symptom below: the export job splits its input on '\n'
only, so the whole file arrives as a single record (note "Map input records=1"
in the quoted logs), and everything after the first "|" is dropped.
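
In the meantime, a possible workaround (a rough sketch, untested; data-nl and
part-00000 are placeholder names) is to rewrite the "|" delimiters as newlines
before exporting:

hadoop fs -cat data/part-00000 | tr '|' '\n' | grep . | \
  hadoop fs -put - data-nl/part-00000

(The "grep ." just drops the empty line that the trailing "|" would leave
behind.) The export could then be re-run against data-nl without
--input-lines-terminated-by:

bin/sqoop export --connect 'jdbc:mysql://localhost/mydb' -password pwd
--username usr --table mytable --export-dir data-nl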

Could anyone please confirm that this issue has been moved to/fixed in
Apache Sqoop after incubation?

Thanks,
Tushar

On Thu, May 17, 2012 at 5:29 PM, Tushar Sudake <etushar89@gmail.com> wrote:

> Hi,
>
> I have some data in text file on HDFS and want to export this data into
> MySQL database.
> But I want Sqoop to use "|" as record delimiter instead of default "\n"
> record delimiter.
>
> So I am specifying the ' --input-lines-terminated-by "|" ' option in my
> Sqoop export command.
>
> The export succeeds, but the number of records reported as exported is only
> 1. And when I check the MySQL target table, I see only one record.
>
> It looks like only the one record before the first "|" is getting exported.
>
> Here's sample data on HDFS:
>
> 1,Hello|2,How|3,Are|4,You|5,I|6,am|7,fine|
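>
> With "|" honored as the record delimiter, that sample should parse as seven
> records:
>
> 1,Hello
> 2,How
> 3,Are
> 4,You
> 5,I
> 6,am
> 7,fine
>
> i.e. seven rows in mytable.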
>
> Sqoop Export command:
>
> bin/sqoop export --connect 'jdbc:mysql://localhost/mydb' -password pwd
> --username usr --table mytable --export-dir data
> --input-lines-terminated-by "|"
>
> Console logs:
>
> 12/05/17 03:32:02 WARN tool.BaseSqoopTool: Setting your password on the
> command-line is insecure. Consider using -P instead.
> 12/05/17 03:32:02 INFO manager.MySQLManager: Preparing to use a MySQL
> streaming resultset.
> 12/05/17 03:32:02 INFO tool.CodeGenTool: Beginning code generation
> 12/05/17 03:32:02 INFO manager.SqlManager: Executing SQL statement: SELECT
> t.* FROM `mytable` AS t LIMIT 1
> 12/05/17 03:32:03 INFO orm.CompilationManager: HADOOP_HOME is
> /home/tushar/hadoop-0.20.2-cdh3u4
> Note:
> /tmp/sqoop-tushar/compile/fd6d3bfd4c2ed7f2e19a2de418993dfc/mytable.java
> uses or overrides a deprecated API.
> Note: Recompile with -Xlint:deprecation for details.
> 12/05/17 03:32:04 INFO orm.CompilationManager: Writing jar file:
> /tmp/sqoop-tushar/compile/fd6d3bfd4c2ed7f2e19a2de418993dfc/mytable.jar
> 12/05/17 03:32:04 INFO mapreduce.ExportJobBase: Beginning export of mytable
> 12/05/17 03:32:06 INFO input.FileInputFormat: Total input paths to process
> : 1
> 12/05/17 03:32:06 INFO input.FileInputFormat: Total input paths to process
> : 1
> 12/05/17 03:32:06 INFO mapred.JobClient: Running job: job_201205110542_0432
> 12/05/17 03:32:07 INFO mapred.JobClient:  map 0% reduce 0%
> 12/05/17 03:32:13 INFO mapred.JobClient:  map 100% reduce 0%
> 12/05/17 03:32:14 INFO mapred.JobClient: Job complete:
> job_201205110542_0432
> 12/05/17 03:32:14 INFO mapred.JobClient: Counters: 16
> 12/05/17 03:32:14 INFO mapred.JobClient:   Job Counters
> 12/05/17 03:32:14 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=6685
> 12/05/17 03:32:14 INFO mapred.JobClient:     Total time spent by all
> reduces waiting after reserving slots (ms)=0
> 12/05/17 03:32:14 INFO mapred.JobClient:     Total time spent by all maps
> waiting after reserving slots (ms)=0
> 12/05/17 03:32:14 INFO mapred.JobClient:     Launched map tasks=1
> 12/05/17 03:32:14 INFO mapred.JobClient:     Data-local map tasks=1
> 12/05/17 03:32:14 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
> 12/05/17 03:32:14 INFO mapred.JobClient:   FileSystemCounters
> 12/05/17 03:32:14 INFO mapred.JobClient:     HDFS_BYTES_READ=166
> 12/05/17 03:32:14 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=79082
> 12/05/17 03:32:14 INFO mapred.JobClient:   Map-Reduce Framework
> 12/05/17 03:32:14 INFO mapred.JobClient:     Map input records=1
> 12/05/17 03:32:14 INFO mapred.JobClient:     Physical memory (bytes)
> snapshot=68677632
> 12/05/17 03:32:14 INFO mapred.JobClient:     Spilled Records=0
> 12/05/17 03:32:14 INFO mapred.JobClient:     CPU time spent (ms)=1130
> 12/05/17 03:32:14 INFO mapred.JobClient:     Total committed heap usage
> (bytes)=39911424
> 12/05/17 03:32:14 INFO mapred.JobClient:     Virtual memory (bytes)
> snapshot=392290304
> 12/05/17 03:32:14 INFO mapred.JobClient:     Map output records=1
> 12/05/17 03:32:14 INFO mapred.JobClient:     SPLIT_RAW_BYTES=117
> 12/05/17 03:32:14 INFO mapreduce.ExportJobBase: Transferred 166 bytes in
> 9.6013 seconds (17.2893 bytes/sec)
> 12/05/17 03:32:14 INFO mapreduce.ExportJobBase: Exported 1 records.
>
> On MySQL side:
>
> mysql> select * from mytable;
> +------+-------+
> | i    | name  |
> +------+-------+
> |    1 | Hello |
> +------+-------+
> 1 row in set (0.00 sec)
>
> Sqoop version is: sqoop-1.4.1-incubating__hadoop-1.0.0
> Hadoop Version: CDH3u4
>
> Doesn't Sqoop support any record delimiter other than "\n", or am I missing
> something?
> Please suggest a solution for this.
>
> Thanks,
> Tushar
>
