sqoop-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Leonardo Brambilla <lbrambi...@contractor.elance-odesk.com>
Subject Re: Fwd: Sqoop export not working when using "update-key"
Date Thu, 17 Jul 2014 14:44:12 GMT
Hello Jarek,

I'm getting back to this issue, I'm trying to fix it by using Oraoop but
that doesn't avoid the exception:
java.io.IOException: java.sql.SQLException: Missing IN or OUT parameter at
index:: 4

I ran a couple of tests and I can tell that the following command works to
insert new rows:
*sqoop export \*
*--connect jdbc:oracle:thin:@ddb04.local.com:1541/test04
<http://jdbc:oracle:thin:@ddb04.local.com:1541/test04> \*

*--columns $columns \*
*--table $table --username $user --password $passwd \*
*--fields-terminated-by "=" --export-dir $exportDir*

But the following command (just added --update-key) throws an exception:
*sqoop export \*
*--connect jdbc:oracle:thin:@ddb04.local.com:1541/test04
<http://jdbc:oracle:thin:@ddb04.local.com:1541/test04> \*
*--update-key "SEARCH_DATE" \*
*--columns $columns \*
*--table $table --username $user --password $passwd \*
*--fields-terminated-by "=" --export-dir $exportDir*

DB is oracle 11.2.0.2.0
Sqoop is 1.4.4
Java 1.7
Oraoop 1.6
Oracle jdbc driver "ojdb6c.jar" implementation version 11.2.0.3.0

Like I said before, all the log I can get from the failed task I already
posted here.

Can you confirm that Sqoop export update works on Oracle DBs?
Thanks in advance
Leo



On Fri, May 16, 2014 at 4:51 PM, Jarek Jarcec Cecho <jarcec@apache.org>
wrote:

> Hi Leonardo,
> sadly the Sqoop output might not be that much helpful in this case, could
> you please share with us the failed map task log?
>
> The easiest way how to get it on Hadoop 1.x is to open the job tracker
> webinterface, find the failed Sqoop job and navigate to the failed map
> tasks.
>
> Jarcec
>
> On Tue, May 13, 2014 at 11:36:34AM -0300, Leonardo Brambilla wrote:
> > Hi Jarek, find below the full sqoop generated log. I went through all the
> > Cluster's nodes for this task logs and there is nothing more than this
> same
> > error. I really don't know what else to look for.
> >
> > Thanks
> >
> >
> > Warning: /usr/lib/hbase does not exist! HBase imports will fail.
> > Please set $HBASE_HOME to the root of your HBase installation.
> > 14/05/13 10:26:41 WARN tool.BaseSqoopTool: Setting your password on the
> > command-line is insecure. Consider using -P instead.
> > 14/05/13 10:26:41 INFO manager.SqlManager: Using default fetchSize of
> 1000
> > 14/05/13 10:26:41 INFO manager.OracleManager: Time zone has been set to
> GMT
> > 14/05/13 10:26:41 INFO tool.CodeGenTool: Beginning code generation
> > 14/05/13 10:26:41 INFO manager.OracleManager: Time zone has been set to
> GMT
> > 14/05/13 10:26:41 INFO manager.SqlManager: Executing SQL statement:
> SELECT
> > t.* FROM etl.EXPT_SPAM_RED_JOB t WHERE 1=0
> > 14/05/13 10:26:41 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is
> > /home/elance/hadoop
> > Note:
> >
> /tmp/sqoop-elance/compile/9f8f413ab105fbe67d985bdb29534d27/etl_EXPT_SPAM_RED_JOB.java
> > uses or overrides a deprecated API.
> > Note: Recompile with -Xlint:deprecation for details.
> > 14/05/13 10:26:42 INFO orm.CompilationManager: Writing jar file:
> >
> /tmp/sqoop-elance/compile/9f8f413ab105fbe67d985bdb29534d27/etl.EXPT_SPAM_RED_JOB.jar
> > 14/05/13 10:26:42 INFO mapreduce.ExportJobBase: Beginning export of
> > etl.EXPT_SPAM_RED_JOB
> > 14/05/13 10:26:43 INFO input.FileInputFormat: Total input paths to
> process
> > : 1
> > 14/05/13 10:26:43 INFO input.FileInputFormat: Total input paths to
> process
> > : 1
> > 14/05/13 10:26:44 INFO mapred.JobClient: Running job:
> job_201404190827_0998
> > 14/05/13 10:26:45 INFO mapred.JobClient:  map 0% reduce 0%
> > 14/05/13 10:26:53 INFO mapred.JobClient:  map 25% reduce 0%
> > 14/05/13 10:26:54 INFO mapred.JobClient:  map 75% reduce 0%
> > 14/05/13 10:26:55 INFO mapred.JobClient: Task Id :
> > attempt_201404190827_0998_m_000001_0, Status : FAILED
> > java.io.IOException: java.sql.SQLException: Missing IN or OUT parameter
> at
> > index:: 4
> >         at
> >
> org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:184)
> >         at
> >
> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651)
> >         at
> org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
> >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
> >         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
> >         at java.security.AccessController.doPrivileged(Native Method)
> >         at javax.security.auth.Subject.doAs(Subject.java:415)
> >         at
> >
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
> >         at org.apache.hadoop.mapred.Child.main(Child.java:249)
> > Caused by: java.sql.SQLException: Missing IN or OUT parameter at index::
> 4
> >         at
> >
> oracle.jdbc.driver.OraclePreparedStatement.processCompletedBindRow(OraclePreparedStatement.java:1844)
> >         at
> >
> oracle.jdbc.driver.OraclePreparedStatement.addBatch(OraclePreparedStatement.java:10213)
> >         at
> >
> oracle.jdbc.driver.OraclePreparedStatementWrapper.addBatch(OraclePreparedStatementWrapper.java:1362)
> >         at
> >
> org.apache.sqoop.mapreduce.UpdateOutputFormat$UpdateRecordWriter.getPreparedStatement(UpdateOutputFormat.java:174)
> >         at
> >
> org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.execUpdate(AsyncSqlRecordWriter.java:149)
> >         at
> >
> org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:181)
> >         ... 8 more
> >
> > 14/05/13 10:27:00 INFO mapred.JobClient: Task Id :
> > attempt_201404190827_0998_m_000001_1, Status : FAILED
> > java.io.IOException: java.sql.SQLException: Missing IN or OUT parameter
> at
> > index:: 4
> >         at
> >
> org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:184)
> >         at
> >
> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651)
> >         at
> org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
> >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
> >         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
> >         at java.security.AccessController.doPrivileged(Native Method)
> >         at javax.security.auth.Subject.doAs(Subject.java:415)
> >         at
> >
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
> >         at org.apache.hadoop.mapred.Child.main(Child.java:249)
> > Caused by: java.sql.SQLException: Missing IN or OUT parameter at index::
> 4
> >         at
> >
> oracle.jdbc.driver.OraclePreparedStatement.processCompletedBindRow(OraclePreparedStatement.java:1844)
> >         at
> >
> oracle.jdbc.driver.OraclePreparedStatement.addBatch(OraclePreparedStatement.java:10213)
> >         at
> >
> oracle.jdbc.driver.OraclePreparedStatementWrapper.addBatch(OraclePreparedStatementWrapper.java:1362)
> >         at
> >
> org.apache.sqoop.mapreduce.UpdateOutputFormat$UpdateRecordWriter.getPreparedStatement(UpdateOutputFormat.java:174)
> >         at
> >
> org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.execUpdate(AsyncSqlRecordWriter.java:149)
> >         at
> >
> org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:181)
> >         ... 8 more
> >
> > 14/05/13 10:27:05 INFO mapred.JobClient: Task Id :
> > attempt_201404190827_0998_m_000001_2, Status : FAILED
> > java.io.IOException: java.sql.SQLException: Missing IN or OUT parameter
> at
> > index:: 4
> >         at
> >
> org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:184)
> >         at
> >
> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651)
> >         at
> org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
> >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
> >         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
> >         at java.security.AccessController.doPrivileged(Native Method)
> >         at javax.security.auth.Subject.doAs(Subject.java:415)
> >         at
> >
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
> >         at org.apache.hadoop.mapred.Child.main(Child.java:249)
> > Caused by: java.sql.SQLException: Missing IN or OUT parameter at index::
> 4
> >         at
> >
> oracle.jdbc.driver.OraclePreparedStatement.processCompletedBindRow(OraclePreparedStatement.java:1844)
> >         at
> >
> oracle.jdbc.driver.OraclePreparedStatement.addBatch(OraclePreparedStatement.java:10213)
> >         at
> >
> oracle.jdbc.driver.OraclePreparedStatementWrapper.addBatch(OraclePreparedStatementWrapper.java:1362)
> >         at
> >
> org.apache.sqoop.mapreduce.UpdateOutputFormat$UpdateRecordWriter.getPreparedStatement(UpdateOutputFormat.java:174)
> >         at
> >
> org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.execUpdate(AsyncSqlRecordWriter.java:149)
> >         at
> >
> org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:181)
> >         ... 8 more
> >
> > 14/05/13 10:27:13 INFO mapred.JobClient: Job complete:
> job_201404190827_0998
> > 14/05/13 10:27:13 INFO mapred.JobClient: Counters: 20
> > 14/05/13 10:27:13 INFO mapred.JobClient:   Job Counters
> > 14/05/13 10:27:13 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=30548
> > 14/05/13 10:27:13 INFO mapred.JobClient:     Total time spent by all
> > reduces waiting after reserving slots (ms)=0
> > 14/05/13 10:27:13 INFO mapred.JobClient:     Total time spent by all maps
> > waiting after reserving slots (ms)=0
> > 14/05/13 10:27:13 INFO mapred.JobClient:     Rack-local map tasks=5
> > 14/05/13 10:27:13 INFO mapred.JobClient:     Launched map tasks=7
> > 14/05/13 10:27:13 INFO mapred.JobClient:     Data-local map tasks=2
> > 14/05/13 10:27:13 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
> > 14/05/13 10:27:13 INFO mapred.JobClient:     Failed map tasks=1
> > 14/05/13 10:27:13 INFO mapred.JobClient:   File Output Format Counters
> > 14/05/13 10:27:13 INFO mapred.JobClient:     Bytes Written=0
> > 14/05/13 10:27:13 INFO mapred.JobClient:   FileSystemCounters
> > 14/05/13 10:27:13 INFO mapred.JobClient:     HDFS_BYTES_READ=459
> > 14/05/13 10:27:13 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=189077
> > 14/05/13 10:27:13 INFO mapred.JobClient:   File Input Format Counters
> > 14/05/13 10:27:13 INFO mapred.JobClient:     Bytes Read=0
> > 14/05/13 10:27:13 INFO mapred.JobClient:   Map-Reduce Framework
> > 14/05/13 10:27:13 INFO mapred.JobClient:     Map input records=0
> > 14/05/13 10:27:13 INFO mapred.JobClient:     Physical memory (bytes)
> > snapshot=363053056
> > 14/05/13 10:27:13 INFO mapred.JobClient:     Spilled Records=0
> > 14/05/13 10:27:13 INFO mapred.JobClient:     CPU time spent (ms)=2110
> > 14/05/13 10:27:13 INFO mapred.JobClient:     Total committed heap usage
> > (bytes)=553517056
> > 14/05/13 10:27:13 INFO mapred.JobClient:     Virtual memory (bytes)
> > snapshot=2344087552
> > 14/05/13 10:27:13 INFO mapred.JobClient:     Map output records=0
> > 14/05/13 10:27:13 INFO mapred.JobClient:     SPLIT_RAW_BYTES=404
> > 14/05/13 10:27:13 INFO mapreduce.ExportJobBase: Transferred 459 bytes in
> > 30.0642 seconds (15.2673 bytes/sec)
> > 14/05/13 10:27:13 INFO mapreduce.ExportJobBase: Exported 0 records.
> > 14/05/13 10:27:13 ERROR tool.ExportTool: Error during export: Export job
> > failed!
> >
> >
> >
> > On Mon, May 12, 2014 at 10:44 PM, Jarek Jarcec Cecho <jarcec@apache.org
> >wrote:
> >
> > > The map task log contain entire executed query and lot of additional
> > > information and hence it's supper useful in such cases.
> > >
> > > Jarcec
> > >
> > > On Mon, May 12, 2014 at 02:59:56PM -0300, Leonardo Brambilla wrote:
> > > > Hi Jarek,
> > > >
> > > > thanks for replying, I don't have the logs. I'll see if I can run the
> > > task
> > > > again and then keep the logs.
> > > >
> > > > Anyway, I don't remember seeing anything else than this SQLException
> > > about
> > > > missing parameter.
> > > >
> > > > Leo
> > > >
> > > >
> > > > On Sun, May 11, 2014 at 10:59 AM, Jarek Jarcec Cecho <
> jarcec@apache.org
> > > >wrote:
> > > >
> > > > > Hi Leonardo,
> > > > > would you mind sharing with us task log from the failed map task?
> > > > >
> > > > > Jarcec
> > > > >
> > > > > On Sun, May 11, 2014 at 10:33:11AM -0300, Leonardo Brambilla wrote:
> > > > > > Hello, I am struggling to make it work, what is a really required
> > > > > feature.
> > > > > >
> > > > > > I have a process that daily generates new data, this data needs
> to be
> > > > > > pushed to a table in Oracle, the table might already have same
> data
> > > from
> > > > > > previous loads. I need to avoid duplicating data on it. Pretty
> common
> > > > > > scenario right? =)
> > > > > >
> > > > > > I am using sqoop export for this, no special arguments, just
> columns,
> > > > > > fields-terminated-by, table and db connection, plus the argument
> > > > > > "update-mode allowinsert".
> > > > > >
> > > > > > Now, when I also include the argument "update-key" with a comma
> > > separated
> > > > > > list of fields (which is the same for arg columns) I get the
> > > following
> > > > > > oracle driver error:
> > > > > >
> > > > > > 14/05/07 16:00:03 INFO mapred.JobClient: Task Id :
> > > > > > attempt_201404190827_0928_m_000003_2, Status : FAILED
> > > > > > java.io.IOException: Can't export data, please check task tracker
> > > logs
> > > > > >         at
> > > > > >
> > > > >
> > >
> org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
> > > > > >         at
> > > > > >
> > >
> org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
> > > > > >         at
> org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> > > > > >         at
> > > > > >
> > > > >
> > >
> org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
> > > > > >         at
> > > > > org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
> > > > > >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
> > > > > >         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
> > > > > >         at java.security.AccessController.doPrivileged(Native
> Method)
> > > > > >         at javax.security.auth.Subject.doAs(Subject.java:415)
> > > > > >         at
> > > > > >
> > > > >
> > >
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
> > > > > >         at org.apache.hadoop.mapred.Child.main(Child.java:249)
> > > > > > Caused by: java.io.IOException: java.sql.SQLException: Missing
> IN or
> > > OUT
> > > > > > parameter at index:: 4
> > > > > >         at
> > > > > >
> > > > >
> > >
> org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:220)
> > > > > >         at
> > > > > >
> > > > >
> > >
> org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:46)
> > > > > >         at
> > > > > >
> > > > >
> > >
> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:639)
> > > > > >         at
> > > > > >
> > > > >
> > >
> org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
> > > > > >         at
> > > > > >
> > >
> org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:84)
> > > > > >         ... 10 more
> > > > > > *Caused by: java.sql.SQLException: Missing IN or OUT parameter
at
> > > > > index:: 4*
> > > > > >         at
> > > > > >
> > > > >
> > >
> oracle.jdbc.driver.OraclePreparedStatement.processCompletedBindRow(OraclePreparedStatement.java:1844)
> > > > > >         at
> > > > > >
> > > > >
> > >
> oracle.jdbc.driver.OraclePreparedStatement.addBatch(OraclePreparedStatement.java:10213)
> > > > > >         at
> > > > > >
> > > > >
> > >
> oracle.jdbc.driver.OraclePreparedStatementWrapper.addBatch(OraclePreparedStatementWrapper.java:1362)
> > > > > >         at
> > > > > >
> > > > >
> > >
> org.apache.sqoop.mapreduce.UpdateOutputFormat$UpdateRecordWriter.getPreparedStatement(UpdateOutputFormat.java:174)
> > > > > >         at
> > > > > >
> > > > >
> > >
> org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.execUpdate(AsyncSqlRecordWriter.java:149)
> > > > > >         at
> > > > > >
> > > > >
> > >
> org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:215)
> > > > > >         ... 14 more
> > > > > >
> > > > > > I'm using Sqoop 1.4.3 with hadoop1, also tried 1.4.4 with same
> > > result. I
> > > > > > have the standard Oracle JDBC driver 6 with Java 7.
> > > > > > I went through all the documentation, Sqoop user guide says
this
> is
> > > > > > supported for built-in connector which I understand I am using.
> > > > > >
> > > > > > Here is the full command:
> > > > > > $sqoopExecutable export \
> > > > > > --outdir $outdir \
> > > > > > --connect $connectionString --table $table_client --username
> $dbUser
> > > > > > --password $dbUserPasswd \
> > > > > > --columns CLIENT_ID,EXP_ID,BUCKET_ID --update-key
> > > > > > CLIENT_ID,EXP_ID,BUCKET_ID \
> > > > > > --fields-terminated-by '\t' --update-mode allowinsert \
> > > > > > --export-dir $dataSource_client > $sqoopLog 2>&1
> > > > > >
> > > > > > Can someone please shed some light on this?
> > > > > > Thank you in advance.
> > > > > >
> > > > > > Leo
> > > > >
> > >
>

Mime
View raw message