sqoop-user mailing list archives

From Gwen Shapira <gshap...@cloudera.com>
Subject Re: Fwd: Sqoop export not working when using "update-key"
Date Thu, 17 Jul 2014 16:52:12 GMT
I can confirm that Sqoop export update works on Oracle, both with and
without Oraoop.

The specific exception you are getting indicates that Oracle expects at
least 4 columns of data, while the HDFS file may contain fewer than that.

Can you double-check that the columns in Oracle and in your data file
match, and that you are using the correct delimiter?
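
As far as I can tell, with --update-key Sqoop binds one parameter per
exported column (split between the SET list and the WHERE clause of the
generated UPDATE), so every record has to split into as many fields as
there are columns. A rough way to compare the two (a sketch only;
$exportDir, $columns and the "=" delimiter are the placeholders from your
own command):

# Fields per record in the first few lines of the export files:
hadoop fs -cat $exportDir/part-* | head -5 | awk -F'=' '{ print NF }'

# Number of columns being exported:
echo $columns | awk -F',' '{ print NF }'

If the two numbers differ, that would explain the missing bind parameter.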

And as Jarcec said, if you have access to the Task Tracker user logs for
one of the mappers, you'll have much more detail to work with, for
example the specific line that failed.
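
If the web UI is hard to reach, the per-attempt logs usually also sit on
the local disk of the node that ran the attempt (a sketch; the exact path
depends on your Hadoop 1.x log directory settings, and the job/attempt
IDs below are just the ones from your earlier post):

# On the task tracker node that ran the failed attempt:
less $HADOOP_HOME/logs/userlogs/job_201404190827_0998/attempt_201404190827_0998_m_000001_0/syslog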

Gwen




On Thu, Jul 17, 2014 at 7:44 AM, Leonardo Brambilla <
lbrambilla@contractor.elance-odesk.com> wrote:

> Hello Jarek,
>
> I'm getting back to this issue. I'm trying to fix it by using Oraoop, but
> that doesn't avoid the exception:
> java.io.IOException: java.sql.SQLException: Missing IN or OUT parameter at
> index:: 4
>
> I ran a couple of tests and I can tell that the following command works to
> insert new rows:
> sqoop export \
> --connect jdbc:oracle:thin:@ddb04.local.com:1541/test04 \
> --columns $columns \
> --table $table --username $user --password $passwd \
> --fields-terminated-by "=" --export-dir $exportDir
>
> But the following command (just added --update-key) throws an exception:
> sqoop export \
> --connect jdbc:oracle:thin:@ddb04.local.com:1541/test04 \
> --update-key "SEARCH_DATE" \
> --columns $columns \
> --table $table --username $user --password $passwd \
> --fields-terminated-by "=" --export-dir $exportDir
>
> DB is Oracle 11.2.0.2.0
> Sqoop is 1.4.4
> Java 1.7
> Oraoop 1.6
> Oracle JDBC driver "ojdbc6.jar", implementation version 11.2.0.3.0
>
> As I said before, I have already posted here all the log output I can get
> from the failed task.
>
> Can you confirm that Sqoop export update works on Oracle DBs?
> Thanks in advance
> Leo
>
>
>
> On Fri, May 16, 2014 at 4:51 PM, Jarek Jarcec Cecho <jarcec@apache.org>
> wrote:
>
>> Hi Leonardo,
>> sadly the Sqoop output might not be that helpful in this case; could
>> you please share with us the failed map task log?
>>
>> The easiest way to get it on Hadoop 1.x is to open the job tracker
>> web interface, find the failed Sqoop job, and navigate to the failed map
>> tasks.
>>
>> Jarcec
>>
>> On Tue, May 13, 2014 at 11:36:34AM -0300, Leonardo Brambilla wrote:
>> > Hi Jarek, find below the full Sqoop-generated log. I went through all
>> > the cluster's nodes looking for this task's logs and there is nothing
>> > more than this same error. I really don't know what else to look for.
>> >
>> > Thanks
>> >
>> >
>> > Warning: /usr/lib/hbase does not exist! HBase imports will fail.
>> > Please set $HBASE_HOME to the root of your HBase installation.
>> > 14/05/13 10:26:41 WARN tool.BaseSqoopTool: Setting your password on the
>> > command-line is insecure. Consider using -P instead.
>> > 14/05/13 10:26:41 INFO manager.SqlManager: Using default fetchSize of
>> 1000
>> > 14/05/13 10:26:41 INFO manager.OracleManager: Time zone has been set to
>> GMT
>> > 14/05/13 10:26:41 INFO tool.CodeGenTool: Beginning code generation
>> > 14/05/13 10:26:41 INFO manager.OracleManager: Time zone has been set to
>> GMT
>> > 14/05/13 10:26:41 INFO manager.SqlManager: Executing SQL statement:
>> SELECT
>> > t.* FROM etl.EXPT_SPAM_RED_JOB t WHERE 1=0
>> > 14/05/13 10:26:41 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is
>> > /home/elance/hadoop
>> > Note:
>> >
>> /tmp/sqoop-elance/compile/9f8f413ab105fbe67d985bdb29534d27/etl_EXPT_SPAM_RED_JOB.java
>> > uses or overrides a deprecated API.
>> > Note: Recompile with -Xlint:deprecation for details.
>> > 14/05/13 10:26:42 INFO orm.CompilationManager: Writing jar file:
>> >
>> /tmp/sqoop-elance/compile/9f8f413ab105fbe67d985bdb29534d27/etl.EXPT_SPAM_RED_JOB.jar
>> > 14/05/13 10:26:42 INFO mapreduce.ExportJobBase: Beginning export of
>> > etl.EXPT_SPAM_RED_JOB
>> > 14/05/13 10:26:43 INFO input.FileInputFormat: Total input paths to
>> process
>> > : 1
>> > 14/05/13 10:26:43 INFO input.FileInputFormat: Total input paths to
>> process
>> > : 1
>> > 14/05/13 10:26:44 INFO mapred.JobClient: Running job:
>> job_201404190827_0998
>> > 14/05/13 10:26:45 INFO mapred.JobClient:  map 0% reduce 0%
>> > 14/05/13 10:26:53 INFO mapred.JobClient:  map 25% reduce 0%
>> > 14/05/13 10:26:54 INFO mapred.JobClient:  map 75% reduce 0%
>> > 14/05/13 10:26:55 INFO mapred.JobClient: Task Id :
>> > attempt_201404190827_0998_m_000001_0, Status : FAILED
>> > java.io.IOException: java.sql.SQLException: Missing IN or OUT parameter
>> at
>> > index:: 4
>> >         at
>> >
>> org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:184)
>> >         at
>> >
>> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651)
>> >         at
>> org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
>> >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>> >         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>> >         at java.security.AccessController.doPrivileged(Native Method)
>> >         at javax.security.auth.Subject.doAs(Subject.java:415)
>> >         at
>> >
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
>> >         at org.apache.hadoop.mapred.Child.main(Child.java:249)
>> > Caused by: java.sql.SQLException: Missing IN or OUT parameter at
>> index:: 4
>> >         at
>> >
>> oracle.jdbc.driver.OraclePreparedStatement.processCompletedBindRow(OraclePreparedStatement.java:1844)
>> >         at
>> >
>> oracle.jdbc.driver.OraclePreparedStatement.addBatch(OraclePreparedStatement.java:10213)
>> >         at
>> >
>> oracle.jdbc.driver.OraclePreparedStatementWrapper.addBatch(OraclePreparedStatementWrapper.java:1362)
>> >         at
>> >
>> org.apache.sqoop.mapreduce.UpdateOutputFormat$UpdateRecordWriter.getPreparedStatement(UpdateOutputFormat.java:174)
>> >         at
>> >
>> org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.execUpdate(AsyncSqlRecordWriter.java:149)
>> >         at
>> >
>> org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:181)
>> >         ... 8 more
>> >
>> > 14/05/13 10:27:00 INFO mapred.JobClient: Task Id :
>> > attempt_201404190827_0998_m_000001_1, Status : FAILED
>> > java.io.IOException: java.sql.SQLException: Missing IN or OUT parameter
>> at
>> > index:: 4
>> >         at
>> >
>> org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:184)
>> >         at
>> >
>> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651)
>> >         at
>> org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
>> >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>> >         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>> >         at java.security.AccessController.doPrivileged(Native Method)
>> >         at javax.security.auth.Subject.doAs(Subject.java:415)
>> >         at
>> >
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
>> >         at org.apache.hadoop.mapred.Child.main(Child.java:249)
>> > Caused by: java.sql.SQLException: Missing IN or OUT parameter at
>> index:: 4
>> >         at
>> >
>> oracle.jdbc.driver.OraclePreparedStatement.processCompletedBindRow(OraclePreparedStatement.java:1844)
>> >         at
>> >
>> oracle.jdbc.driver.OraclePreparedStatement.addBatch(OraclePreparedStatement.java:10213)
>> >         at
>> >
>> oracle.jdbc.driver.OraclePreparedStatementWrapper.addBatch(OraclePreparedStatementWrapper.java:1362)
>> >         at
>> >
>> org.apache.sqoop.mapreduce.UpdateOutputFormat$UpdateRecordWriter.getPreparedStatement(UpdateOutputFormat.java:174)
>> >         at
>> >
>> org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.execUpdate(AsyncSqlRecordWriter.java:149)
>> >         at
>> >
>> org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:181)
>> >         ... 8 more
>> >
>> > 14/05/13 10:27:05 INFO mapred.JobClient: Task Id :
>> > attempt_201404190827_0998_m_000001_2, Status : FAILED
>> > java.io.IOException: java.sql.SQLException: Missing IN or OUT parameter
>> at
>> > index:: 4
>> >         at
>> >
>> org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:184)
>> >         at
>> >
>> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651)
>> >         at
>> org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
>> >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>> >         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>> >         at java.security.AccessController.doPrivileged(Native Method)
>> >         at javax.security.auth.Subject.doAs(Subject.java:415)
>> >         at
>> >
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
>> >         at org.apache.hadoop.mapred.Child.main(Child.java:249)
>> > Caused by: java.sql.SQLException: Missing IN or OUT parameter at
>> index:: 4
>> >         at
>> >
>> oracle.jdbc.driver.OraclePreparedStatement.processCompletedBindRow(OraclePreparedStatement.java:1844)
>> >         at
>> >
>> oracle.jdbc.driver.OraclePreparedStatement.addBatch(OraclePreparedStatement.java:10213)
>> >         at
>> >
>> oracle.jdbc.driver.OraclePreparedStatementWrapper.addBatch(OraclePreparedStatementWrapper.java:1362)
>> >         at
>> >
>> org.apache.sqoop.mapreduce.UpdateOutputFormat$UpdateRecordWriter.getPreparedStatement(UpdateOutputFormat.java:174)
>> >         at
>> >
>> org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.execUpdate(AsyncSqlRecordWriter.java:149)
>> >         at
>> >
>> org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:181)
>> >         ... 8 more
>> >
>> > 14/05/13 10:27:13 INFO mapred.JobClient: Job complete:
>> job_201404190827_0998
>> > 14/05/13 10:27:13 INFO mapred.JobClient: Counters: 20
>> > 14/05/13 10:27:13 INFO mapred.JobClient:   Job Counters
>> > 14/05/13 10:27:13 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=30548
>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Total time spent by all
>> > reduces waiting after reserving slots (ms)=0
>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Total time spent by all
>> maps
>> > waiting after reserving slots (ms)=0
>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Rack-local map tasks=5
>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Launched map tasks=7
>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Data-local map tasks=2
>> > 14/05/13 10:27:13 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Failed map tasks=1
>> > 14/05/13 10:27:13 INFO mapred.JobClient:   File Output Format Counters
>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Bytes Written=0
>> > 14/05/13 10:27:13 INFO mapred.JobClient:   FileSystemCounters
>> > 14/05/13 10:27:13 INFO mapred.JobClient:     HDFS_BYTES_READ=459
>> > 14/05/13 10:27:13 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=189077
>> > 14/05/13 10:27:13 INFO mapred.JobClient:   File Input Format Counters
>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Bytes Read=0
>> > 14/05/13 10:27:13 INFO mapred.JobClient:   Map-Reduce Framework
>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Map input records=0
>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Physical memory (bytes)
>> > snapshot=363053056
>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Spilled Records=0
>> > 14/05/13 10:27:13 INFO mapred.JobClient:     CPU time spent (ms)=2110
>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Total committed heap usage
>> > (bytes)=553517056
>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Virtual memory (bytes)
>> > snapshot=2344087552
>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Map output records=0
>> > 14/05/13 10:27:13 INFO mapred.JobClient:     SPLIT_RAW_BYTES=404
>> > 14/05/13 10:27:13 INFO mapreduce.ExportJobBase: Transferred 459 bytes in
>> > 30.0642 seconds (15.2673 bytes/sec)
>> > 14/05/13 10:27:13 INFO mapreduce.ExportJobBase: Exported 0 records.
>> > 14/05/13 10:27:13 ERROR tool.ExportTool: Error during export: Export job
>> > failed!
>> >
>> >
>> >
>> > On Mon, May 12, 2014 at 10:44 PM, Jarek Jarcec Cecho <jarcec@apache.org
>> >wrote:
>> >
>> > > The map task log contains the entire executed query and a lot of
>> > > additional information, and hence it's super useful in such cases.
>> > >
>> > > Jarcec
>> > >
>> > > On Mon, May 12, 2014 at 02:59:56PM -0300, Leonardo Brambilla wrote:
>> > > > Hi Jarek,
>> > > >
>> > > > thanks for replying, I don't have the logs. I'll see if I can run the
>> > > > task again and then keep the logs.
>> > > >
>> > > > Anyway, I don't remember seeing anything other than this SQLException
>> > > > about the missing parameter.
>> > > >
>> > > > Leo
>> > > >
>> > > >
>> > > > On Sun, May 11, 2014 at 10:59 AM, Jarek Jarcec Cecho <
>> jarcec@apache.org
>> > > >wrote:
>> > > >
>> > > > > Hi Leonardo,
>> > > > > would you mind sharing with us task log from the failed map task?
>> > > > >
>> > > > > Jarcec
>> > > > >
>> > > > > On Sun, May 11, 2014 at 10:33:11AM -0300, Leonardo Brambilla
>> wrote:
>> > > > > > Hello, I am struggling to make this work, and it is a really
>> > > > > > required feature.
>> > > > > >
>> > > > > > I have a process that generates new data daily. This data needs to
>> > > > > > be pushed to a table in Oracle, and the table might already have
>> > > > > > some of the same data from previous loads. I need to avoid
>> > > > > > duplicating data on it. Pretty common scenario, right? =)
>> > > > > >
>> > > > > > I am using sqoop export for this, no special arguments, just
>> > > > > > columns, fields-terminated-by, table and db connection, plus the
>> > > > > > argument "update-mode allowinsert".
>> > > > > >
>> > > > > > Now, when I also include the argument "update-key" with a
>> > > > > > comma-separated list of fields (the same list as for the columns
>> > > > > > argument) I get the following Oracle driver error:
>> > > > > >
>> > > > > > 14/05/07 16:00:03 INFO mapred.JobClient: Task Id :
>> > > > > > attempt_201404190827_0928_m_000003_2, Status : FAILED
>> > > > > > java.io.IOException: Can't export data, please check task
>> tracker
>> > > logs
>> > > > > >         at
>> > > > > >
>> > > > >
>> > >
>> org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
>> > > > > >         at
>> > > > > >
>> > >
>> org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
>> > > > > >         at
>> org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>> > > > > >         at
>> > > > > >
>> > > > >
>> > >
>> org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
>> > > > > >         at
>> > > > > org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>> > > > > >         at
>> org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>> > > > > >         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>> > > > > >         at java.security.AccessController.doPrivileged(Native
>> Method)
>> > > > > >         at javax.security.auth.Subject.doAs(Subject.java:415)
>> > > > > >         at
>> > > > > >
>> > > > >
>> > >
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
>> > > > > >         at org.apache.hadoop.mapred.Child.main(Child.java:249)
>> > > > > > Caused by: java.io.IOException: java.sql.SQLException: Missing
>> IN or
>> > > OUT
>> > > > > > parameter at index:: 4
>> > > > > >         at
>> > > > > >
>> > > > >
>> > >
>> org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:220)
>> > > > > >         at
>> > > > > >
>> > > > >
>> > >
>> org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:46)
>> > > > > >         at
>> > > > > >
>> > > > >
>> > >
>> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:639)
>> > > > > >         at
>> > > > > >
>> > > > >
>> > >
>> org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>> > > > > >         at
>> > > > > >
>> > >
>> org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:84)
>> > > > > >         ... 10 more
>> > > > > > *Caused by: java.sql.SQLException: Missing IN or OUT parameter
>> at
>> > > > > index:: 4*
>> > > > > >         at
>> > > > > >
>> > > > >
>> > >
>> oracle.jdbc.driver.OraclePreparedStatement.processCompletedBindRow(OraclePreparedStatement.java:1844)
>> > > > > >         at
>> > > > > >
>> > > > >
>> > >
>> oracle.jdbc.driver.OraclePreparedStatement.addBatch(OraclePreparedStatement.java:10213)
>> > > > > >         at
>> > > > > >
>> > > > >
>> > >
>> oracle.jdbc.driver.OraclePreparedStatementWrapper.addBatch(OraclePreparedStatementWrapper.java:1362)
>> > > > > >         at
>> > > > > >
>> > > > >
>> > >
>> org.apache.sqoop.mapreduce.UpdateOutputFormat$UpdateRecordWriter.getPreparedStatement(UpdateOutputFormat.java:174)
>> > > > > >         at
>> > > > > >
>> > > > >
>> > >
>> org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.execUpdate(AsyncSqlRecordWriter.java:149)
>> > > > > >         at
>> > > > > >
>> > > > >
>> > >
>> org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:215)
>> > > > > >         ... 14 more
>> > > > > >
>> > > > > > I'm using Sqoop 1.4.3 with hadoop1, and I also tried 1.4.4 with
>> > > > > > the same result. I have the standard Oracle JDBC driver 6 with
>> > > > > > Java 7. I went through all the documentation; the Sqoop user guide
>> > > > > > says this is supported for the built-in connector, which I
>> > > > > > understand I am using.
>> > > > > >
>> > > > > > Here is the full command:
>> > > > > > $sqoopExecutable export \
>> > > > > > --outdir $outdir \
>> > > > > > --connect $connectionString --table $table_client \
>> > > > > > --username $dbUser --password $dbUserPasswd \
>> > > > > > --columns CLIENT_ID,EXP_ID,BUCKET_ID \
>> > > > > > --update-key CLIENT_ID,EXP_ID,BUCKET_ID \
>> > > > > > --fields-terminated-by '\t' --update-mode allowinsert \
>> > > > > > --export-dir $dataSource_client > $sqoopLog 2>&1
>> > > > > >
>> > > > > > Can someone please shed some light on this?
>> > > > > > Thank you in advance.
>> > > > > >
>> > > > > > Leo
>> > > > >
>> > >
>>
>
>
