sqoop-user mailing list archives

From Jarek Jarcec Cecho <jar...@apache.org>
Subject Re: Fwd: Sqoop export not working when using "update-key"
Date Fri, 16 May 2014 19:51:04 GMT
Hi Leonardo,
sadly, the Sqoop output might not be all that helpful in this case. Could you please share
the failed map task's log with us?

The easiest way to get it on Hadoop 1.x is to open the JobTracker web interface, find
the failed Sqoop job, and navigate to its failed map tasks.
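
If the web interface is not handy, the same log can usually be read straight off the
TaskTracker node that ran the attempt. A minimal sketch, assuming the default Hadoop 1.x
per-attempt log layout under the TaskTracker's log directory (the directory below is an
assumption to adjust for your installation; the job and attempt IDs are taken from the
log quoted further down):

    # Assumed Hadoop 1.x layout: per-attempt logs under ${HADOOP_LOG_DIR}/userlogs.
    # Adjust HADOOP_LOG_DIR for your installation.
    HADOOP_LOG_DIR=${HADOOP_LOG_DIR:-/var/log/hadoop}
    JOB_ID=job_201404190827_0998
    ATTEMPT_ID=attempt_201404190827_0998_m_000001_0

    # syslog carries the task's log4j output, which is where the failing
    # statement should show up.
    less "${HADOOP_LOG_DIR}/userlogs/${JOB_ID}/${ATTEMPT_ID}/syslog"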

Jarcec

On Tue, May 13, 2014 at 11:36:34AM -0300, Leonardo Brambilla wrote:
> Hi Jarek, find below the full Sqoop-generated log. I went through all of the cluster's
> nodes looking for this task's logs, and there is nothing more than this same error.
> I really don't know what else to look for.
> 
> Thanks
> 
> 
> Warning: /usr/lib/hbase does not exist! HBase imports will fail.
> Please set $HBASE_HOME to the root of your HBase installation.
> 14/05/13 10:26:41 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
> 14/05/13 10:26:41 INFO manager.SqlManager: Using default fetchSize of 1000
> 14/05/13 10:26:41 INFO manager.OracleManager: Time zone has been set to GMT
> 14/05/13 10:26:41 INFO tool.CodeGenTool: Beginning code generation
> 14/05/13 10:26:41 INFO manager.OracleManager: Time zone has been set to GMT
> 14/05/13 10:26:41 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM etl.EXPT_SPAM_RED_JOB t WHERE 1=0
> 14/05/13 10:26:41 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/elance/hadoop
> Note: /tmp/sqoop-elance/compile/9f8f413ab105fbe67d985bdb29534d27/etl_EXPT_SPAM_RED_JOB.java uses or overrides a deprecated API.
> Note: Recompile with -Xlint:deprecation for details.
> 14/05/13 10:26:42 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-elance/compile/9f8f413ab105fbe67d985bdb29534d27/etl.EXPT_SPAM_RED_JOB.jar
> 14/05/13 10:26:42 INFO mapreduce.ExportJobBase: Beginning export of etl.EXPT_SPAM_RED_JOB
> 14/05/13 10:26:43 INFO input.FileInputFormat: Total input paths to process : 1
> 14/05/13 10:26:43 INFO input.FileInputFormat: Total input paths to process : 1
> 14/05/13 10:26:44 INFO mapred.JobClient: Running job: job_201404190827_0998
> 14/05/13 10:26:45 INFO mapred.JobClient:  map 0% reduce 0%
> 14/05/13 10:26:53 INFO mapred.JobClient:  map 25% reduce 0%
> 14/05/13 10:26:54 INFO mapred.JobClient:  map 75% reduce 0%
> 14/05/13 10:26:55 INFO mapred.JobClient: Task Id : attempt_201404190827_0998_m_000001_0, Status : FAILED
> java.io.IOException: java.sql.SQLException: Missing IN or OUT parameter at index:: 4
>         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:184)
>         at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
>         at org.apache.hadoop.mapred.Child.main(Child.java:249)
> Caused by: java.sql.SQLException: Missing IN or OUT parameter at index:: 4
>         at oracle.jdbc.driver.OraclePreparedStatement.processCompletedBindRow(OraclePreparedStatement.java:1844)
>         at oracle.jdbc.driver.OraclePreparedStatement.addBatch(OraclePreparedStatement.java:10213)
>         at oracle.jdbc.driver.OraclePreparedStatementWrapper.addBatch(OraclePreparedStatementWrapper.java:1362)
>         at org.apache.sqoop.mapreduce.UpdateOutputFormat$UpdateRecordWriter.getPreparedStatement(UpdateOutputFormat.java:174)
>         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.execUpdate(AsyncSqlRecordWriter.java:149)
>         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:181)
>         ... 8 more
> 
> 14/05/13 10:27:00 INFO mapred.JobClient: Task Id : attempt_201404190827_0998_m_000001_1, Status : FAILED
> java.io.IOException: java.sql.SQLException: Missing IN or OUT parameter at index:: 4
>         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:184)
>         at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
>         at org.apache.hadoop.mapred.Child.main(Child.java:249)
> Caused by: java.sql.SQLException: Missing IN or OUT parameter at index:: 4
>         at oracle.jdbc.driver.OraclePreparedStatement.processCompletedBindRow(OraclePreparedStatement.java:1844)
>         at oracle.jdbc.driver.OraclePreparedStatement.addBatch(OraclePreparedStatement.java:10213)
>         at oracle.jdbc.driver.OraclePreparedStatementWrapper.addBatch(OraclePreparedStatementWrapper.java:1362)
>         at org.apache.sqoop.mapreduce.UpdateOutputFormat$UpdateRecordWriter.getPreparedStatement(UpdateOutputFormat.java:174)
>         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.execUpdate(AsyncSqlRecordWriter.java:149)
>         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:181)
>         ... 8 more
> 
> 14/05/13 10:27:05 INFO mapred.JobClient: Task Id : attempt_201404190827_0998_m_000001_2, Status : FAILED
> java.io.IOException: java.sql.SQLException: Missing IN or OUT parameter at index:: 4
>         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:184)
>         at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
>         at org.apache.hadoop.mapred.Child.main(Child.java:249)
> Caused by: java.sql.SQLException: Missing IN or OUT parameter at index:: 4
>         at oracle.jdbc.driver.OraclePreparedStatement.processCompletedBindRow(OraclePreparedStatement.java:1844)
>         at oracle.jdbc.driver.OraclePreparedStatement.addBatch(OraclePreparedStatement.java:10213)
>         at oracle.jdbc.driver.OraclePreparedStatementWrapper.addBatch(OraclePreparedStatementWrapper.java:1362)
>         at org.apache.sqoop.mapreduce.UpdateOutputFormat$UpdateRecordWriter.getPreparedStatement(UpdateOutputFormat.java:174)
>         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.execUpdate(AsyncSqlRecordWriter.java:149)
>         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:181)
>         ... 8 more
> 
> 14/05/13 10:27:13 INFO mapred.JobClient: Job complete: job_201404190827_0998
> 14/05/13 10:27:13 INFO mapred.JobClient: Counters: 20
> 14/05/13 10:27:13 INFO mapred.JobClient:   Job Counters
> 14/05/13 10:27:13 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=30548
> 14/05/13 10:27:13 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
> 14/05/13 10:27:13 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
> 14/05/13 10:27:13 INFO mapred.JobClient:     Rack-local map tasks=5
> 14/05/13 10:27:13 INFO mapred.JobClient:     Launched map tasks=7
> 14/05/13 10:27:13 INFO mapred.JobClient:     Data-local map tasks=2
> 14/05/13 10:27:13 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
> 14/05/13 10:27:13 INFO mapred.JobClient:     Failed map tasks=1
> 14/05/13 10:27:13 INFO mapred.JobClient:   File Output Format Counters
> 14/05/13 10:27:13 INFO mapred.JobClient:     Bytes Written=0
> 14/05/13 10:27:13 INFO mapred.JobClient:   FileSystemCounters
> 14/05/13 10:27:13 INFO mapred.JobClient:     HDFS_BYTES_READ=459
> 14/05/13 10:27:13 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=189077
> 14/05/13 10:27:13 INFO mapred.JobClient:   File Input Format Counters
> 14/05/13 10:27:13 INFO mapred.JobClient:     Bytes Read=0
> 14/05/13 10:27:13 INFO mapred.JobClient:   Map-Reduce Framework
> 14/05/13 10:27:13 INFO mapred.JobClient:     Map input records=0
> 14/05/13 10:27:13 INFO mapred.JobClient:     Physical memory (bytes) snapshot=363053056
> 14/05/13 10:27:13 INFO mapred.JobClient:     Spilled Records=0
> 14/05/13 10:27:13 INFO mapred.JobClient:     CPU time spent (ms)=2110
> 14/05/13 10:27:13 INFO mapred.JobClient:     Total committed heap usage (bytes)=553517056
> 14/05/13 10:27:13 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=2344087552
> 14/05/13 10:27:13 INFO mapred.JobClient:     Map output records=0
> 14/05/13 10:27:13 INFO mapred.JobClient:     SPLIT_RAW_BYTES=404
> 14/05/13 10:27:13 INFO mapreduce.ExportJobBase: Transferred 459 bytes in 30.0642 seconds (15.2673 bytes/sec)
> 14/05/13 10:27:13 INFO mapreduce.ExportJobBase: Exported 0 records.
> 14/05/13 10:27:13 ERROR tool.ExportTool: Error during export: Export job failed!
> 
> 
> 
> On Mon, May 12, 2014 at 10:44 PM, Jarek Jarcec Cecho <jarcec@apache.org> wrote:
> 
> > The map task log contains the entire executed query and a lot of additional
> > information, and hence it's super useful in such cases.
> >
> > Jarcec
> >
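
Since the note above says the map task log contains the entire executed query, one quick
way to pull just that part out is to grep the attempt's syslog. A sketch under the same
assumptions as the earlier snippet (default Hadoop 1.x userlogs layout, IDs taken from
the quoted log); the pattern is only a guess at how the generated statement appears in
the task output:

    # Hypothetical attempt directory; adjust the log root for your cluster.
    ATTEMPT_DIR=/var/log/hadoop/userlogs/job_201404190827_0998/attempt_201404190827_0998_m_000001_0
    # Show any UPDATE/INSERT statements logged by the map task, with context.
    grep -n -i -E -B2 -A2 'update|insert into' "${ATTEMPT_DIR}/syslog"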
> > On Mon, May 12, 2014 at 02:59:56PM -0300, Leonardo Brambilla wrote:
> > > Hi Jarek,
> > >
> > > Thanks for replying, I don't have the logs. I'll see if I can run the task
> > > again and then keep the logs.
> > >
> > > Anyway, I don't remember seeing anything other than this SQLException about
> > > the missing parameter.
> > >
> > > Leo
> > >
> > >
> > > On Sun, May 11, 2014 at 10:59 AM, Jarek Jarcec Cecho <jarcec@apache.org> wrote:
> > >
> > > > Hi Leonardo,
> > > > would you mind sharing the task log from the failed map task with us?
> > > >
> > > > Jarcec
> > > >
> > > > On Sun, May 11, 2014 at 10:33:11AM -0300, Leonardo Brambilla wrote:
> > > > > Hello, I am struggling to make this work, and it is a feature I really need.
> > > > >
> > > > > I have a process that generates new data daily; this data needs to be pushed
> > > > > to a table in Oracle, and the table might already contain the same data from
> > > > > previous loads. I need to avoid duplicating data in it. Pretty common
> > > > > scenario, right? =)
> > > > >
> > > > > I am using sqoop export for this with no special arguments, just columns,
> > > > > fields-terminated-by, table and the db connection, plus the argument
> > > > > "update-mode allowinsert".
> > > > >
> > > > > Now, when I also include the argument "update-key" with a comma-separated
> > > > > list of fields (the same list as the columns argument), I get the following
> > > > > Oracle driver error:
> > > > >
> > > > > 14/05/07 16:00:03 INFO mapred.JobClient: Task Id : attempt_201404190827_0928_m_000003_2, Status : FAILED
> > > > > java.io.IOException: Can't export data, please check task tracker logs
> > > > >         at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
> > > > >         at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
> > > > >         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> > > > >         at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
> > > > >         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
> > > > >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
> > > > >         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
> > > > >         at java.security.AccessController.doPrivileged(Native Method)
> > > > >         at javax.security.auth.Subject.doAs(Subject.java:415)
> > > > >         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
> > > > >         at org.apache.hadoop.mapred.Child.main(Child.java:249)
> > > > > Caused by: java.io.IOException: java.sql.SQLException: Missing IN or OUT parameter at index:: 4
> > > > >         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:220)
> > > > >         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:46)
> > > > >         at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:639)
> > > > >         at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
> > > > >         at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:84)
> > > > >         ... 10 more
> > > > > *Caused by: java.sql.SQLException: Missing IN or OUT parameter at index:: 4*
> > > > >         at oracle.jdbc.driver.OraclePreparedStatement.processCompletedBindRow(OraclePreparedStatement.java:1844)
> > > > >         at oracle.jdbc.driver.OraclePreparedStatement.addBatch(OraclePreparedStatement.java:10213)
> > > > >         at oracle.jdbc.driver.OraclePreparedStatementWrapper.addBatch(OraclePreparedStatementWrapper.java:1362)
> > > > >         at org.apache.sqoop.mapreduce.UpdateOutputFormat$UpdateRecordWriter.getPreparedStatement(UpdateOutputFormat.java:174)
> > > > >         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.execUpdate(AsyncSqlRecordWriter.java:149)
> > > > >         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:215)
> > > > >         ... 14 more
> > > > >
> > > > > I'm using Sqoop 1.4.3 with hadoop1, and also tried 1.4.4 with the same result.
> > > > > I have the standard Oracle JDBC driver 6 with Java 7.
> > > > > I went through all the documentation; the Sqoop user guide says this is
> > > > > supported for the built-in connector, which I understand I am using.
> > > > >
> > > > > Here is the full command:
> > > > > $sqoopExecutable export \
> > > > > --outdir $outdir \
> > > > > --connect $connectionString --table $table_client --username $dbUser --password $dbUserPasswd \
> > > > > --columns CLIENT_ID,EXP_ID,BUCKET_ID --update-key CLIENT_ID,EXP_ID,BUCKET_ID \
> > > > > --fields-terminated-by '\t' --update-mode allowinsert \
> > > > > --export-dir $dataSource_client > $sqoopLog 2>&1
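
For readability, here is a minimal, hypothetical expansion of the command quoted above:
the connect string, username, and HDFS/local paths are illustrative placeholders invented
for this sketch (only the table and column names come from the thread), and prompting for
the password with -P instead of --password merely follows the warning in the quoted log.

    # Hypothetical values; substitute your own connect string, user, and paths.
    sqoop export \
      --connect jdbc:oracle:thin:@db.example.com:1521:ORCL \
      --username etl_user -P \
      --table etl.EXPT_SPAM_RED_JOB \
      --columns CLIENT_ID,EXP_ID,BUCKET_ID \
      --update-key CLIENT_ID,EXP_ID,BUCKET_ID \
      --update-mode allowinsert \
      --fields-terminated-by '\t' \
      --export-dir /user/etl/expt_spam_red_job \
      --outdir /tmp/sqoop-codegen

The point to note is that --update-key names exactly the same three columns as --columns,
which is the configuration that produces the error reported in this thread.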
> > > > >
> > > > > Can someone please shed some light on this?
> > > > > Thank you in advance.
> > > > >
> > > > > Leo
