sqoop-dev mailing list archives

From "Anna Szonyi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SQOOP-2411) Sqoop using '--direct' option fails with mysqldump exit code 2 and 3
Date Mon, 14 Aug 2017 12:18:00 GMT

    [ https://issues.apache.org/jira/browse/SQOOP-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16125598#comment-16125598 ]

Anna Szonyi commented on SQOOP-2411:
------------------------------------

Hi [~sanysandish@gmail.com],

Thanks for following up on this jira!

In general, we should only close these types of JIRAs if we know that we can't solve the issue
from the Sqoop side, or it's an expected failure, or not a problem at all. However, there is still a
question around the cause of the exception: is the logging sufficient for the end user to tell what
the root cause was? It is also a question of whether increasing 'net-write-timeout' or
'net-read-timeout' should solve these cases, and if so, whether it's just a matter of increasing
them further (and to what value), whether we're not passing them correctly (a bug on our end), or
whether they don't have the desired effect (which would suggest a documentation update).

In general, if you can reproduce the issue and think it's solvable, this could become an improvement
JIRA to improve the logging or fix the timeout handling, and to check whether increasing
net-read-timeout actually helps (or a documentation JIRA about usage).
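
For reference, a minimal sketch (purely illustrative; the host, user, and the 6000-second value are
assumptions taken from the report below) of how the MySQL-side timeouts can be checked and raised
before re-running the import:

    # Show the current server-side network timeouts
    mysql -h <remote ip> -u tuser -p -e "SHOW GLOBAL VARIABLES LIKE 'net%timeout';"

    # Raise them; SET GLOBAL only affects connections opened after this point,
    # which covers the mysqldump sessions started by a new Sqoop job
    mysql -h <remote ip> -u tuser -p \
      -e "SET GLOBAL net_read_timeout = 6000; SET GLOBAL net_write_timeout = 6000;"

If raising these doesn't change the behaviour, that would point towards the values not being applied
to the right sessions, or the timeouts not being the root cause at all.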

Thanks,
Anna

> Sqoop using '--direct' option fails with mysqldump exit code 2 and 3
> --------------------------------------------------------------------
>
>                 Key: SQOOP-2411
>                 URL: https://issues.apache.org/jira/browse/SQOOP-2411
>             Project: Sqoop
>          Issue Type: Bug
>          Components: connectors/mysql
>    Affects Versions: 1.4.6
>         Environment: Amazon EMR
>            Reporter: Karthick H
>            Assignee: Sandish Kumar HN
>            Priority: Critical
>
> I am running Sqoop in AWS EMR. I am trying to copy a table ~10 GB from MySQL into HDFS.
> I get the following exception
> 15/07/06 12:19:07 INFO mapreduce.Job: Task Id : attempt_1435664372091_0048_m_000000_2, Status : FAILED
> Error: java.io.IOException: mysqldump terminated with status 3
> at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:485)
> at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:49)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:152)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:773)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:170)
> 15/07/06 12:19:07 INFO mapreduce.Job: Task Id : attempt_1435664372091_0048_m_000005_2, Status : FAILED
> Error: java.io.IOException: mysqldump terminated with status 2
> at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:485)
> at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:49)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:152)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:773)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:170)
> 15/07/06 12:19:08 INFO mapreduce.Job:  map 0% reduce 0%
> 15/07/06 12:19:20 INFO mapreduce.Job:  map 25% reduce 0%
> 15/07/06 12:19:22 INFO mapreduce.Job:  map 38% reduce 0%
> 15/07/06 12:19:23 INFO mapreduce.Job:  map 50% reduce 0%
> 15/07/06 12:19:24 INFO mapreduce.Job:  map 75% reduce 0%
> 15/07/06 12:19:25 INFO mapreduce.Job:  map 100% reduce 0%
> 15/07/06 12:23:11 INFO mapreduce.Job: Job job_1435664372091_0048 failed with state FAILED due to: Task failed task_1435664372091_0048_m_000000
> Job failed as tasks failed. failedMaps:1 failedReduces:0
> 15/07/06 12:23:11 INFO mapreduce.Job: Counters: 8
>         Job Counters 
>         Failed map tasks=28
>         Launched map tasks=28
>         Other local map tasks=28
>         Total time spent by all maps in occupied slots (ms)=34760760
>         Total time spent by all reduces in occupied slots (ms)=0
>         Total time spent by all map tasks (ms)=5793460
>         Total vcore-seconds taken by all map tasks=5793460
>         Total megabyte-seconds taken by all map tasks=8342582400
> 15/07/06 12:23:11 WARN mapreduce.Counters: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
> 15/07/06 12:23:11 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 829.8697 seconds (0 bytes/sec)
> 15/07/06 12:23:11 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
> 15/07/06 12:23:11 INFO mapreduce.ImportJobBase: Retrieved 0 records.
> 15/07/06 12:23:11 ERROR tool.ImportTool: Error during import: Import job failed!
> If I run without the '--direct' option, I get the communication exception as in https://issues.cloudera.org/browse/SQOOP-186
> I have set 'net-write-timeout' and 'net-read-timeout' values in MySQL to 6000.
> My Sqoop command looks like this
> sqoop import -D mapred.task.timeout=0 --fields-terminated-by '\t' --escaped-by '\\' --optionally-enclosed-by '\"' --bindir ./ --connect jdbc:mysql://<remote ip>/<mysql db> --username tuser --password tuser --table table1 --target-dir=/base/table1 --split-by id -m 8 --direct
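
For anyone reproducing this: assuming the server-side timeouts are already raised, a sketch of one
variation worth trying is to lower the parallelism and pass options straight through to mysqldump
(Sqoop's MySQL direct mode forwards everything after a standalone '--' to the underlying tool).
The '-m 4', '--quick' and '--single-transaction' choices below are illustrative, not a verified fix:

    sqoop import -D mapred.task.timeout=0 \
        --connect jdbc:mysql://<remote ip>/<mysql db> --username tuser --password tuser \
        --table table1 --target-dir=/base/table1 --split-by id -m 4 --direct \
        -- --quick --single-transaction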



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
