sqoop-user mailing list archives

From Gwen Shapira <gshap...@cloudera.com>
Subject Re: Fwd: Sqoop export not working when using "update-key"
Date Fri, 18 Jul 2014 16:53:14 GMT
It looks like Sqoop is failing to fetch the column list for the table. Can
you check that $table is ALL UPPER CASE?
We are doing "select ... from user_tab_columns where table_name='$table';"

Oracle stores unquoted table names in upper case, so $table has to match.
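
A minimal sketch of normalizing the name before calling sqoop, assuming a
POSIX shell. The to_upper helper and the example value are illustrative only
and are not part of Sqoop:

```shell
#!/bin/sh
# Hypothetical helper (not part of Sqoop): upper-case a table name so it
# matches the way Oracle stores unquoted identifiers in its dictionary views.
to_upper() {
    printf '%s' "$1" | tr '[:lower:]' '[:upper:]'
}

# Example value only; in the commands from this thread, $table is set elsewhere.
table="search_keywords_aggregation"
table=$(to_upper "$table")
echo "$table"
```

The resulting $table can then be passed to --table as in the export commands
quoted below.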

Gwen

On Fri, Jul 18, 2014 at 9:49 AM, Leonardo Brambilla
<lbrambilla@contractor.elance-odesk.com> wrote:
> Well, if I try to run the command without --columns I get an error:
> Attempted to generate class with no columns!
> Part of the log:
> 14/07/18 12:45:04 INFO tool.CodeGenTool: Beginning code generation
> 14/07/18 12:45:04 DEBUG manager.OracleManager: Using column names query:
> SELECT t.* FROM SEARCH_KEYWORDS_AGGREGATION t WHERE 1=0
> 14/07/18 12:45:04 DEBUG manager.SqlManager: Execute getColumnTypesRawQuery :
> SELECT t.* FROM SEARCH_KEYWORDS_AGGREGATION t WHERE 1=0
> 14/07/18 12:45:04 DEBUG manager.OracleManager$ConnCache: Got cached
> connection for jdbc:oracle:thin:@devbox.com:1541/devdb/uDev
> 14/07/18 12:45:04 INFO manager.OracleManager: Time zone has been set to GMT
> 14/07/18 12:45:04 DEBUG manager.SqlManager: Using fetchSize for next query:
> 1000
> 14/07/18 12:45:04 INFO manager.SqlManager: Executing SQL statement: SELECT
> t.* FROM SEARCH_KEYWORDS_AGGREGATION t WHERE 1=0
> 14/07/18 12:45:04 DEBUG manager.OracleManager$ConnCache: Caching released
> connection for jdbc:oracle:thin:@devbox.com:1541/devdb/uDev
> 14/07/18 12:45:04 DEBUG orm.ClassWriter: selected columns:
> 14/07/18 12:45:04 DEBUG orm.ClassWriter: db write column order:
> 14/07/18 12:45:04 DEBUG orm.ClassWriter:   SEARCH_DATE [from --update-key
> parameter]
> 14/07/18 12:45:04 ERROR sqoop.Sqoop: Got exception running Sqoop:
> java.lang.IllegalArgumentException: Attempted to generate class with no
> columns!
> java.lang.IllegalArgumentException: Attempted to generate class with no
> columns!
>         at
> org.apache.sqoop.orm.ClassWriter.generateClassForColumns(ClassWriter.java:1295)
>         at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1176)
>         at
> org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:96)
>         at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java:64)
>         at org.apache.sqoop.tool.ExportTool.run(ExportTool.java:100)
>         at org.apache.sqoop.Sqoop.run(Sqoop.java:145)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>         at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
>         at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)
>         at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)
>         at org.apache.sqoop.Sqoop.main(Sqoop.java:238)
>
> Note that in the log, "selected columns:" is empty.
>
>
> On Fri, Jul 18, 2014 at 1:13 PM, Gwen Shapira <gshapira@cloudera.com> wrote:
>>
>> If we omit --columns the source file has to match the DB in number of
>> columns, types and order.
>>
>> On Fri, Jul 18, 2014 at 8:23 AM, Leonardo Brambilla
>> <lbrambilla@contractor.elance-odesk.com> wrote:
>> > Hello, I have some more findings. Could this bug be related to my problem?
>> > https://issues.apache.org/jira/browse/SQOOP-824
>> > I know it says it has been fixed since 1.4.3, but maybe the fix introduced
>> > some other case.
>> > I still don't understand how to run the export command without specifying
>> > the --columns parameter. Can you tell me what the default behavior is when
>> > you omit --columns? Does the source file need to have the same column order
>> > as the target table?
>> >
>> > Thanks
>> >
>> >
>> > On Fri, Jul 18, 2014 at 12:46 AM, Leonardo Brambilla
>> > <lbrambilla@contractor.elance-odesk.com> wrote:
>> >>
>> >> I think I found something. The Java class generated when using
>> >> --update-key differs from the one generated without it. The one that
>> >> throws the exception does not write the fields that are not specified in
>> >> --update-key. I also see (with --verbose) that when using --update-key
>> >> there is an extra debug line that says
>> >> 14/07/17 22:53:27 DEBUG orm.ClassWriter: db write column order:
>> >> 14/07/17 22:53:27 DEBUG orm.ClassWriter:   SEARCH_DATE
>> >>
>> >> Below is the method generated for the command without --update-key
>> >>
>> >>   public int write(PreparedStatement __dbStmt, int __off) throws SQLException {
>> >>     JdbcWritableBridge.writeTimestamp(SEARCH_DATE, 1 + __off, 93, __dbStmt);
>> >>     JdbcWritableBridge.writeString(SEARCH_TYPE, 2 + __off, 12, __dbStmt);
>> >>     JdbcWritableBridge.writeString(USER_AGENT, 3 + __off, 12, __dbStmt);
>> >>     JdbcWritableBridge.writeString(SRCH_KEYWORD, 4 + __off, 12, __dbStmt);
>> >>     JdbcWritableBridge.writeBigDecimal(SRCH_COUNT, 5 + __off, 2, __dbStmt);
>> >>     return 5;
>> >>   }
>> >>
>> >> Below is the one generated for the command with --update-key
>> >>   public int write(PreparedStatement __dbStmt, int __off) throws SQLException {
>> >>     JdbcWritableBridge.writeTimestamp(SEARCH_DATE, 1 + __off, 93, __dbStmt);
>> >>     return 1;
>> >>   }
>> >>
>> >> I tried to force export to use the properly generated class with the
>> >> parameters "jar-file" and "class-name", but that didn't work; it looks
>> >> like those params are not allowed in the export command. This is what I
>> >> tried in order to force the use of the properly generated source:
>> >> sqoop export \
>> >> --connect jdbc:oracle:thin:@ddb04.local.com:1541/test04 \
>> >> --update-key "SEARCH_DATE" \
>> >> --columns $columns \
>> >> --table $table --username $user --password $passwd \
>> >> --fields-terminated-by "=" --export-dir $exportDir \
>> >> --jar-file SEARCH_TABLE.jar --class-name SEARCH_TABLE
>> >>
>> >>
>> >>
>> >> On Thu, Jul 17, 2014 at 5:04 PM, Leonardo Brambilla
>> >> <lbrambilla@contractor.elance-odesk.com> wrote:
>> >>>
>> >>> Yes, the update-key is a subset of columns.
>> >>>
>> >>>
>> >>> On Thu, Jul 17, 2014 at 4:16 PM, Gwen Shapira <gshapira@cloudera.com>
>> >>> wrote:
>> >>>>
>> >>>> Does the update column appear in $columns? It should be in there.
>> >>>>
>> >>>>
>> >>>> On Thu, Jul 17, 2014 at 10:48 AM, Leonardo Brambilla
>> >>>> <lbrambilla@contractor.elance-odesk.com> wrote:
>> >>>>>
>> >>>>> Hi Gwen, thank you for replying.
>> >>>>>
>> >>>>> I went to the data node, the userlogs and all I found in syslog file
>> >>>>> is
>> >>>>> what I already posted:
>> >>>>> 2014-07-17 10:19:09,280 INFO
>> >>>>> org.apache.hadoop.util.NativeCodeLoader:
>> >>>>> Loaded the native-hadoop library
>> >>>>> 2014-07-17 10:19:09,700 INFO org.apache.hadoop.util.ProcessTree:
>> >>>>> setsid
>> >>>>> exited with exit code 0
>> >>>>> 2014-07-17 10:19:09,706 INFO org.apache.hadoop.mapred.Task:  Using
>> >>>>> ResourceCalculatorPlugin :
>> >>>>> org.apache.hadoop.util.LinuxResourceCalculatorPlugin@34c3a7c0
>> >>>>> 2014-07-17 10:19:10,266 INFO
>> >>>>> org.apache.sqoop.mapreduce.AutoProgressMapper: Auto-progress thread
>> >>>>> is
>> >>>>> finished. keepGoing=false
>> >>>>> 2014-07-17 10:19:10,476 INFO
>> >>>>> org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs'
>> >>>>> truncater
>> >>>>> with mapRetainSize=-1 and reduceRetainSize=-1
>> >>>>> 2014-07-17 10:19:10,537 INFO org.apache.hadoop.io.nativeio.NativeIO:
>> >>>>> Initialized cache for UID to User mapping with a cache timeout of
>> >>>>> 14400
>> >>>>> seconds.
>> >>>>> 2014-07-17 10:19:10,537 INFO org.apache.hadoop.io.nativeio.NativeIO:
>> >>>>> Got UserName elance for UID 666 from the native implementation
>> >>>>> 2014-07-17 10:19:10,539 ERROR
>> >>>>> org.apache.hadoop.security.UserGroupInformation:
>> >>>>> PriviledgedActionException
>> >>>>> as:elance cause:java.io.IOException: java.sql.SQLException: Missing
>> >>>>> IN or
>> >>>>> OUT parameter at index:: 2
>> >>>>> 2014-07-17 10:19:10,540 WARN org.apache.hadoop.mapred.Child: Error
>> >>>>> running child
>> >>>>> java.io.IOException: java.sql.SQLException: Missing IN or OUT
>> >>>>> parameter
>> >>>>> at index:: 2
>> >>>>> at
>> >>>>>
>> >>>>> org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:184)
>> >>>>> at
>> >>>>>
>> >>>>> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651)
>> >>>>> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
>> >>>>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>> >>>>> at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>> >>>>> at java.security.AccessController.doPrivileged(Native Method)
>> >>>>> at javax.security.auth.Subject.doAs(Subject.java:415)
>> >>>>> at
>> >>>>>
>> >>>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
>> >>>>> at org.apache.hadoop.mapred.Child.main(Child.java:249)
>> >>>>> Caused by: java.sql.SQLException: Missing IN or OUT parameter at
>> >>>>> index:: 2
>> >>>>> at
>> >>>>>
>> >>>>> oracle.jdbc.driver.OraclePreparedStatement.processCompletedBindRow(OraclePreparedStatement.java:1844)
>> >>>>> at
>> >>>>>
>> >>>>> oracle.jdbc.driver.OraclePreparedStatement.addBatch(OraclePreparedStatement.java:10213)
>> >>>>> at
>> >>>>>
>> >>>>> oracle.jdbc.driver.OraclePreparedStatementWrapper.addBatch(OraclePreparedStatementWrapper.java:1362)
>> >>>>> at
>> >>>>>
>> >>>>> org.apache.sqoop.mapreduce.UpdateOutputFormat$UpdateRecordWriter.getPreparedStatement(UpdateOutputFormat.java:174)
>> >>>>> at
>> >>>>>
>> >>>>> org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.execUpdate(AsyncSqlRecordWriter.java:149)
>> >>>>> at
>> >>>>>
>> >>>>> org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:181)
>> >>>>> ... 8 more
>> >>>>> 2014-07-17 10:19:10,543 INFO org.apache.hadoop.mapred.Task: Runnning
>> >>>>> cleanup for the task
>> >>>>>
>> >>>>> There isn't more data than that.
>> >>>>> Can you please check my sqoop command and validate that I'm using
>> >>>>> the proper arguments? The argument "--columns" is used in export to
>> >>>>> tell sqoop the order in which it should read the columns from the
>> >>>>> file, right? Does the last column need to have a delimiter too?
>> >>>>> The source file should be OK; keep in mind that it works for insert
>> >>>>> but fails when I add the parameter --update-key.
>> >>>>> Thanks
>> >>>>> Leo
>> >>>>>
>> >>>>>
>> >>>>> On Thu, Jul 17, 2014 at 1:52 PM, Gwen Shapira
>> >>>>> <gshapira@cloudera.com>
>> >>>>> wrote:
>> >>>>>>
>> >>>>>> I can confirm that Sqoop export update works on Oracle, both with
>> >>>>>> and
>> >>>>>> without Oraoop.
>> >>>>>>
>> >>>>>> The specific exception you are getting indicates that Oracle
>> >>>>>> expects at least 4 columns of data and the HDFS file may have fewer
>> >>>>>> than that.
>> >>>>>>
>> >>>>>> Can you double check that the columns in Oracle and your data file
>> >>>>>> match? And that you are using a correct delimiter?
>> >>>>>>
>> >>>>>> And as Jarcec said, if you have access to the Task Tracker user
>> >>>>>> logs
>> >>>>>> for one of the mappers, you'll have much more details to work with
>> >>>>>> - for
>> >>>>>> example the specific line that failed.
>> >>>>>>
>> >>>>>> Gwen
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>> On Thu, Jul 17, 2014 at 7:44 AM, Leonardo Brambilla
>> >>>>>> <lbrambilla@contractor.elance-odesk.com> wrote:
>> >>>>>>>
>> >>>>>>> Hello Jarek,
>> >>>>>>>
>> >>>>>>> I'm getting back to this issue, I'm trying to fix it by using
>> >>>>>>> Oraoop
>> >>>>>>> but that doesn't avoid the exception:
>> >>>>>>> java.io.IOException: java.sql.SQLException: Missing IN or OUT
>> >>>>>>> parameter at index:: 4
>> >>>>>>>
>> >>>>>>> I ran a couple of tests and I can tell that the following command
>> >>>>>>> works to insert new rows:
>> >>>>>>> sqoop export \
>> >>>>>>> --connect jdbc:oracle:thin:@ddb04.local.com:1541/test04 \
>> >>>>>>> --columns $columns \
>> >>>>>>> --table $table --username $user --password $passwd \
>> >>>>>>> --fields-terminated-by "=" --export-dir $exportDir
>> >>>>>>>
>> >>>>>>> But the following command (just added --update-key) throws an
>> >>>>>>> exception:
>> >>>>>>> sqoop export \
>> >>>>>>> --connect jdbc:oracle:thin:@ddb04.local.com:1541/test04 \
>> >>>>>>> --update-key "SEARCH_DATE" \
>> >>>>>>> --columns $columns \
>> >>>>>>> --table $table --username $user --password $passwd \
>> >>>>>>> --fields-terminated-by "=" --export-dir $exportDir
>> >>>>>>>
>> >>>>>>> DB is oracle 11.2.0.2.0
>> >>>>>>> Sqoop is 1.4.4
>> >>>>>>> Java 1.7
>> >>>>>>> Oraoop 1.6
>> >>>>>>> Oracle JDBC driver "ojdbc6.jar" implementation version 11.2.0.3.0
>> >>>>>>>
>> >>>>>>> Like I said before, all the log I can get from the failed task I
>> >>>>>>> already posted here.
>> >>>>>>>
>> >>>>>>> Can you confirm that Sqoop export update works on Oracle DBs?
>> >>>>>>> Thanks in advance
>> >>>>>>> Leo
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> On Fri, May 16, 2014 at 4:51 PM, Jarek Jarcec Cecho
>> >>>>>>> <jarcec@apache.org> wrote:
>> >>>>>>>>
>> >>>>>>>> Hi Leonardo,
>> >>>>>>>> sadly the Sqoop output might not be very helpful in this case.
>> >>>>>>>> Could you please share the failed map task log with us?
>> >>>>>>>>
>> >>>>>>>> The easiest way to get it on Hadoop 1.x is to open the job
>> >>>>>>>> tracker web interface, find the failed Sqoop job and navigate to
>> >>>>>>>> the failed map tasks.
>> >>>>>>>>
>> >>>>>>>> Jarcec
>> >>>>>>>>
>> >>>>>>>> On Tue, May 13, 2014 at 11:36:34AM -0300, Leonardo Brambilla
>> >>>>>>>> wrote:
>> >>>>>>>> > Hi Jarek, find below the full sqoop generated log. I went
>> >>>>>>>> > through
>> >>>>>>>> > all the
>> >>>>>>>> > Cluster's nodes for this task logs and there is nothing more
>> >>>>>>>> > than
>> >>>>>>>> > this same
>> >>>>>>>> > error. I really don't know what else to look for.
>> >>>>>>>> >
>> >>>>>>>> > Thanks
>> >>>>>>>> >
>> >>>>>>>> >
>> >>>>>>>> > Warning: /usr/lib/hbase does not exist! HBase imports will
>> >>>>>>>> > fail.
>> >>>>>>>> > Please set $HBASE_HOME to the root of your HBase installation.
>> >>>>>>>> > 14/05/13 10:26:41 WARN tool.BaseSqoopTool: Setting your
>> >>>>>>>> > password
>> >>>>>>>> > on the
>> >>>>>>>> > command-line is insecure. Consider using -P instead.
>> >>>>>>>> > 14/05/13 10:26:41 INFO manager.SqlManager: Using default
>> >>>>>>>> > fetchSize
>> >>>>>>>> > of 1000
>> >>>>>>>> > 14/05/13 10:26:41 INFO manager.OracleManager: Time zone has
>> >>>>>>>> > been
>> >>>>>>>> > set to GMT
>> >>>>>>>> > 14/05/13 10:26:41 INFO tool.CodeGenTool: Beginning code
>> >>>>>>>> > generation
>> >>>>>>>> > 14/05/13 10:26:41 INFO manager.OracleManager: Time zone has
>> >>>>>>>> > been
>> >>>>>>>> > set to GMT
>> >>>>>>>> > 14/05/13 10:26:41 INFO manager.SqlManager: Executing SQL
>> >>>>>>>> > statement: SELECT
>> >>>>>>>> > t.* FROM etl.EXPT_SPAM_RED_JOB t WHERE 1=0
>> >>>>>>>> > 14/05/13 10:26:41 INFO orm.CompilationManager:
>> >>>>>>>> > HADOOP_MAPRED_HOME
>> >>>>>>>> > is
>> >>>>>>>> > /home/elance/hadoop
>> >>>>>>>> > Note:
>> >>>>>>>> >
>> >>>>>>>> >
>> >>>>>>>> > /tmp/sqoop-elance/compile/9f8f413ab105fbe67d985bdb29534d27/etl_EXPT_SPAM_RED_JOB.java
>> >>>>>>>> > uses or overrides a deprecated API.
>> >>>>>>>> > Note: Recompile with -Xlint:deprecation for details.
>> >>>>>>>> > 14/05/13 10:26:42 INFO orm.CompilationManager: Writing jar
>> >>>>>>>> > file:
>> >>>>>>>> >
>> >>>>>>>> >
>> >>>>>>>> > /tmp/sqoop-elance/compile/9f8f413ab105fbe67d985bdb29534d27/etl.EXPT_SPAM_RED_JOB.jar
>> >>>>>>>> > 14/05/13 10:26:42 INFO mapreduce.ExportJobBase: Beginning
>> >>>>>>>> > export
>> >>>>>>>> > of
>> >>>>>>>> > etl.EXPT_SPAM_RED_JOB
>> >>>>>>>> > 14/05/13 10:26:43 INFO input.FileInputFormat: Total input paths
>> >>>>>>>> > to
>> >>>>>>>> > process
>> >>>>>>>> > : 1
>> >>>>>>>> > 14/05/13 10:26:43 INFO input.FileInputFormat: Total input paths
>> >>>>>>>> > to
>> >>>>>>>> > process
>> >>>>>>>> > : 1
>> >>>>>>>> > 14/05/13 10:26:44 INFO mapred.JobClient: Running job:
>> >>>>>>>> > job_201404190827_0998
>> >>>>>>>> > 14/05/13 10:26:45 INFO mapred.JobClient:  map 0% reduce 0%
>> >>>>>>>> > 14/05/13 10:26:53 INFO mapred.JobClient:  map 25% reduce 0%
>> >>>>>>>> > 14/05/13 10:26:54 INFO mapred.JobClient:  map 75% reduce 0%
>> >>>>>>>> > 14/05/13 10:26:55 INFO mapred.JobClient: Task Id :
>> >>>>>>>> > attempt_201404190827_0998_m_000001_0, Status : FAILED
>> >>>>>>>> > java.io.IOException: java.sql.SQLException: Missing IN or OUT
>> >>>>>>>> > parameter at
>> >>>>>>>> > index:: 4
>> >>>>>>>> >         at
>> >>>>>>>> >
>> >>>>>>>> >
>> >>>>>>>> > org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:184)
>> >>>>>>>> >         at
>> >>>>>>>> >
>> >>>>>>>> >
>> >>>>>>>> > org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651)
>> >>>>>>>> >         at
>> >>>>>>>> > org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
>> >>>>>>>> >         at
>> >>>>>>>> > org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>> >>>>>>>> >         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>> >>>>>>>> >         at java.security.AccessController.doPrivileged(Native
>> >>>>>>>> > Method)
>> >>>>>>>> >         at javax.security.auth.Subject.doAs(Subject.java:415)
>> >>>>>>>> >         at
>> >>>>>>>> >
>> >>>>>>>> >
>> >>>>>>>> > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
>> >>>>>>>> >         at org.apache.hadoop.mapred.Child.main(Child.java:249)
>> >>>>>>>> > Caused by: java.sql.SQLException: Missing IN or OUT parameter
>> >>>>>>>> > at
>> >>>>>>>> > index:: 4
>> >>>>>>>> >         at
>> >>>>>>>> >
>> >>>>>>>> >
>> >>>>>>>> > oracle.jdbc.driver.OraclePreparedStatement.processCompletedBindRow(OraclePreparedStatement.java:1844)
>> >>>>>>>> >         at
>> >>>>>>>> >
>> >>>>>>>> >
>> >>>>>>>> > oracle.jdbc.driver.OraclePreparedStatement.addBatch(OraclePreparedStatement.java:10213)
>> >>>>>>>> >         at
>> >>>>>>>> >
>> >>>>>>>> >
>> >>>>>>>> > oracle.jdbc.driver.OraclePreparedStatementWrapper.addBatch(OraclePreparedStatementWrapper.java:1362)
>> >>>>>>>> >         at
>> >>>>>>>> >
>> >>>>>>>> >
>> >>>>>>>> > org.apache.sqoop.mapreduce.UpdateOutputFormat$UpdateRecordWriter.getPreparedStatement(UpdateOutputFormat.java:174)
>> >>>>>>>> >         at
>> >>>>>>>> >
>> >>>>>>>> >
>> >>>>>>>> > org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.execUpdate(AsyncSqlRecordWriter.java:149)
>> >>>>>>>> >         at
>> >>>>>>>> >
>> >>>>>>>> >
>> >>>>>>>> > org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:181)
>> >>>>>>>> >         ... 8 more
>> >>>>>>>> >
>> >>>>>>>> > 14/05/13 10:27:00 INFO mapred.JobClient: Task Id :
>> >>>>>>>> > attempt_201404190827_0998_m_000001_1, Status : FAILED
>> >>>>>>>> > java.io.IOException: java.sql.SQLException: Missing IN or OUT
>> >>>>>>>> > parameter at
>> >>>>>>>> > index:: 4
>> >>>>>>>> >         at
>> >>>>>>>> >
>> >>>>>>>> >
>> >>>>>>>> > org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:184)
>> >>>>>>>> >         at
>> >>>>>>>> >
>> >>>>>>>> >
>> >>>>>>>> > org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651)
>> >>>>>>>> >         at
>> >>>>>>>> > org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
>> >>>>>>>> >         at
>> >>>>>>>> > org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>> >>>>>>>> >         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>> >>>>>>>> >         at java.security.AccessController.doPrivileged(Native
>> >>>>>>>> > Method)
>> >>>>>>>> >         at javax.security.auth.Subject.doAs(Subject.java:415)
>> >>>>>>>> >         at
>> >>>>>>>> >
>> >>>>>>>> >
>> >>>>>>>> > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
>> >>>>>>>> >         at org.apache.hadoop.mapred.Child.main(Child.java:249)
>> >>>>>>>> > Caused by: java.sql.SQLException: Missing IN or OUT parameter
>> >>>>>>>> > at
>> >>>>>>>> > index:: 4
>> >>>>>>>> >         at
>> >>>>>>>> >
>> >>>>>>>> >
>> >>>>>>>> > oracle.jdbc.driver.OraclePreparedStatement.processCompletedBindRow(OraclePreparedStatement.java:1844)
>> >>>>>>>> >         at
>> >>>>>>>> >
>> >>>>>>>> >
>> >>>>>>>> > oracle.jdbc.driver.OraclePreparedStatement.addBatch(OraclePreparedStatement.java:10213)
>> >>>>>>>> >         at
>> >>>>>>>> >
>> >>>>>>>> >
>> >>>>>>>> > oracle.jdbc.driver.OraclePreparedStatementWrapper.addBatch(OraclePreparedStatementWrapper.java:1362)
>> >>>>>>>> >         at
>> >>>>>>>> >
>> >>>>>>>> >
>> >>>>>>>> > org.apache.sqoop.mapreduce.UpdateOutputFormat$UpdateRecordWriter.getPreparedStatement(UpdateOutputFormat.java:174)
>> >>>>>>>> >         at
>> >>>>>>>> >
>> >>>>>>>> >
>> >>>>>>>> > org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.execUpdate(AsyncSqlRecordWriter.java:149)
>> >>>>>>>> >         at
>> >>>>>>>> >
>> >>>>>>>> >
>> >>>>>>>> > org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:181)
>> >>>>>>>> >         ... 8 more
>> >>>>>>>> >
>> >>>>>>>> > 14/05/13 10:27:05 INFO mapred.JobClient: Task Id :
>> >>>>>>>> > attempt_201404190827_0998_m_000001_2, Status : FAILED
>> >>>>>>>> > java.io.IOException: java.sql.SQLException: Missing IN or OUT
>> >>>>>>>> > parameter at
>> >>>>>>>> > index:: 4
>> >>>>>>>> >         at
>> >>>>>>>> >
>> >>>>>>>> >
>> >>>>>>>> > org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:184)
>> >>>>>>>> >         at
>> >>>>>>>> >
>> >>>>>>>> >
>> >>>>>>>> > org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651)
>> >>>>>>>> >         at
>> >>>>>>>> > org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
>> >>>>>>>> >         at
>> >>>>>>>> > org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>> >>>>>>>> >         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>> >>>>>>>> >         at java.security.AccessController.doPrivileged(Native
>> >>>>>>>> > Method)
>> >>>>>>>> >         at javax.security.auth.Subject.doAs(Subject.java:415)
>> >>>>>>>> >         at
>> >>>>>>>> >
>> >>>>>>>> >
>> >>>>>>>> > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
>> >>>>>>>> >         at org.apache.hadoop.mapred.Child.main(Child.java:249)
>> >>>>>>>> > Caused by: java.sql.SQLException: Missing IN or OUT parameter
>> >>>>>>>> > at
>> >>>>>>>> > index:: 4
>> >>>>>>>> >         at
>> >>>>>>>> >
>> >>>>>>>> >
>> >>>>>>>> > oracle.jdbc.driver.OraclePreparedStatement.processCompletedBindRow(OraclePreparedStatement.java:1844)
>> >>>>>>>> >         at
>> >>>>>>>> >
>> >>>>>>>> >
>> >>>>>>>> > oracle.jdbc.driver.OraclePreparedStatement.addBatch(OraclePreparedStatement.java:10213)
>> >>>>>>>> >         at
>> >>>>>>>> >
>> >>>>>>>> >
>> >>>>>>>> > oracle.jdbc.driver.OraclePreparedStatementWrapper.addBatch(OraclePreparedStatementWrapper.java:1362)
>> >>>>>>>> >         at
>> >>>>>>>> >
>> >>>>>>>> >
>> >>>>>>>> > org.apache.sqoop.mapreduce.UpdateOutputFormat$UpdateRecordWriter.getPreparedStatement(UpdateOutputFormat.java:174)
>> >>>>>>>> >         at
>> >>>>>>>> >
>> >>>>>>>> >
>> >>>>>>>> > org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.execUpdate(AsyncSqlRecordWriter.java:149)
>> >>>>>>>> >         at
>> >>>>>>>> >
>> >>>>>>>> >
>> >>>>>>>> > org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:181)
>> >>>>>>>> >         ... 8 more
>> >>>>>>>> >
>> >>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient: Job complete:
>> >>>>>>>> > job_201404190827_0998
>> >>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient: Counters: 20
>> >>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:   Job Counters
>> >>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:
>> >>>>>>>> > SLOTS_MILLIS_MAPS=30548
>> >>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Total time spent
>> >>>>>>>> > by
>> >>>>>>>> > all
>> >>>>>>>> > reduces waiting after reserving slots (ms)=0
>> >>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Total time spent
>> >>>>>>>> > by
>> >>>>>>>> > all maps
>> >>>>>>>> > waiting after reserving slots (ms)=0
>> >>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Rack-local map
>> >>>>>>>> > tasks=5
>> >>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Launched map
>> >>>>>>>> > tasks=7
>> >>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Data-local map
>> >>>>>>>> > tasks=2
>> >>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:
>> >>>>>>>> > SLOTS_MILLIS_REDUCES=0
>> >>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Failed map tasks=1
>> >>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:   File Output Format
>> >>>>>>>> > Counters
>> >>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Bytes Written=0
>> >>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:   FileSystemCounters
>> >>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:
>> >>>>>>>> > HDFS_BYTES_READ=459
>> >>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:
>> >>>>>>>> > FILE_BYTES_WRITTEN=189077
>> >>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:   File Input Format
>> >>>>>>>> > Counters
>> >>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Bytes Read=0
>> >>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:   Map-Reduce Framework
>> >>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Map input
>> >>>>>>>> > records=0
>> >>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Physical memory
>> >>>>>>>> > (bytes)
>> >>>>>>>> > snapshot=363053056
>> >>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Spilled Records=0
>> >>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     CPU time spent
>> >>>>>>>> > (ms)=2110
>> >>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Total committed
>> >>>>>>>> > heap
>> >>>>>>>> > usage
>> >>>>>>>> > (bytes)=553517056
>> >>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Virtual memory
>> >>>>>>>> > (bytes)
>> >>>>>>>> > snapshot=2344087552
>> >>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Map output
>> >>>>>>>> > records=0
>> >>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:
>> >>>>>>>> > SPLIT_RAW_BYTES=404
>> >>>>>>>> > 14/05/13 10:27:13 INFO mapreduce.ExportJobBase: Transferred 459
>> >>>>>>>> > bytes in
>> >>>>>>>> > 30.0642 seconds (15.2673 bytes/sec)
>> >>>>>>>> > 14/05/13 10:27:13 INFO mapreduce.ExportJobBase: Exported 0
>> >>>>>>>> > records.
>> >>>>>>>> > 14/05/13 10:27:13 ERROR tool.ExportTool: Error during export:
>> >>>>>>>> > Export job
>> >>>>>>>> > failed!
>> >>>>>>>> >
>> >>>>>>>> >
>> >>>>>>>> >
>> >>>>>>>> > On Mon, May 12, 2014 at 10:44 PM, Jarek Jarcec Cecho
>> >>>>>>>> > <jarcec@apache.org>wrote:
>> >>>>>>>> >
>> >>>>>>>> > > The map task log contains the entire executed query and a lot of
>> >>>>>>>> > > additional information, and hence it's super useful in such cases.
>> >>>>>>>> > >
>> >>>>>>>> > > Jarcec
>> >>>>>>>> > >
>> >>>>>>>> > > On Mon, May 12, 2014 at 02:59:56PM -0300, Leonardo Brambilla
>> >>>>>>>> > > wrote:
>> >>>>>>>> > > > Hi Jarek,
>> >>>>>>>> > > >
>> >>>>>>>> > > > thanks for replying, I don't have the logs. I'll see if I
>> >>>>>>>> > > > can
>> >>>>>>>> > > > run the
>> >>>>>>>> > > task
>> >>>>>>>> > > > again and then keep the logs.
>> >>>>>>>> > > >
>> >>>>>>>> > > > Anyway, I don't remember seeing anything else than this
>> >>>>>>>> > > > SQLException
>> >>>>>>>> > > about
>> >>>>>>>> > > > missing parameter.
>> >>>>>>>> > > >
>> >>>>>>>> > > > Leo
>> >>>>>>>> > > >
>> >>>>>>>> > > >
>> >>>>>>>> > > > On Sun, May 11, 2014 at 10:59 AM, Jarek Jarcec Cecho
>> >>>>>>>> > > > <jarcec@apache.org
>> >>>>>>>> > > >wrote:
>> >>>>>>>> > > >
>> >>>>>>>> > > > > Hi Leonardo,
>> >>>>>>>> > > > > would you mind sharing with us task log from the failed
>> >>>>>>>> > > > > map
>> >>>>>>>> > > > > task?
>> >>>>>>>> > > > >
>> >>>>>>>> > > > > Jarcec
>> >>>>>>>> > > > >
>> >>>>>>>> > > > > On Sun, May 11, 2014 at 10:33:11AM -0300, Leonardo
>> >>>>>>>> > > > > Brambilla
>> >>>>>>>> > > > > wrote:
>> >>>>>>>> > > > > > Hello, I am struggling to make this work, and it is a
>> >>>>>>>> > > > > > feature we really need.
>> >>>>>>>> > > > > >
>> >>>>>>>> > > > > > I have a process that generates new data daily. This data
>> >>>>>>>> > > > > > needs to be pushed to a table in Oracle, and the table might
>> >>>>>>>> > > > > > already have the same data from previous loads. I need to
>> >>>>>>>> > > > > > avoid duplicating data in it. Pretty common scenario, right? =)
>> >>>>>>>> > > > > >
>> >>>>>>>> > > > > > I am using sqoop export for this, with no special arguments: just
>> >>>>>>>> > > > > > columns, fields-terminated-by, table and db connection, plus the
>> >>>>>>>> > > > > > argument "update-mode allowinsert".
>> >>>>>>>> > > > > >
>> >>>>>>>> > > > > > Now, when I also include the argument "update-key" with a
>> >>>>>>>> > > > > > comma-separated list of fields (which is the same as for the
>> >>>>>>>> > > > > > columns argument) I get the following Oracle driver error:
>> >>>>>>>> > > > > >
>> >>>>>>>> > > > > > 14/05/07 16:00:03 INFO mapred.JobClient: Task Id :
>> >>>>>>>> > > > > > attempt_201404190827_0928_m_000003_2, Status : FAILED
>> >>>>>>>> > > > > > java.io.IOException: Can't export data, please check
>> >>>>>>>> > > > > > task
>> >>>>>>>> > > > > > tracker
>> >>>>>>>> > > logs
>> >>>>>>>> > > > > >         at
>> >>>>>>>> > > > > >
>> >>>>>>>> > > > >
>> >>>>>>>> > >
>> >>>>>>>> > >
>> >>>>>>>> > > org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
>> >>>>>>>> > > > > >         at
>> >>>>>>>> > > > > >
>> >>>>>>>> > >
>> >>>>>>>> > >
>> >>>>>>>> > > org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
>> >>>>>>>> > > > > >         at
>> >>>>>>>> > > > > > org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>> >>>>>>>> > > > > >         at
>> >>>>>>>> > > > > >
>> >>>>>>>> > > > >
>> >>>>>>>> > >
>> >>>>>>>> > >
>> >>>>>>>> > > org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
>> >>>>>>>> > > > > >         at
>> >>>>>>>> > > > >
>> >>>>>>>> > > > >
>> >>>>>>>> > > > > org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>> >>>>>>>> > > > > >         at
>> >>>>>>>> > > > > > org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>> >>>>>>>> > > > > >         at
>> >>>>>>>> > > > > > org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>> >>>>>>>> > > > > >         at
>> >>>>>>>> > > > > > java.security.AccessController.doPrivileged(Native
>> >>>>>>>> > > > > > Method)
>> >>>>>>>> > > > > >         at
>> >>>>>>>> > > > > > javax.security.auth.Subject.doAs(Subject.java:415)
>> >>>>>>>> > > > > >         at
>> >>>>>>>> > > > > >
>> >>>>>>>> > > > >
>> >>>>>>>> > >
>> >>>>>>>> > >
>> >>>>>>>> > > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
>> >>>>>>>> > > > > >         at
>> >>>>>>>> > > > > > org.apache.hadoop.mapred.Child.main(Child.java:249)
>> >>>>>>>> > > > > > Caused by: java.io.IOException: java.sql.SQLException:
>> >>>>>>>> > > > > > Missing IN or
>> >>>>>>>> > > OUT
>> >>>>>>>> > > > > > parameter at index:: 4
>> >>>>>>>> > > > > >         at
>> >>>>>>>> > > > > >
>> >>>>>>>> > > > >
>> >>>>>>>> > >
>> >>>>>>>> > >
>> >>>>>>>> > > org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:220)
>> >>>>>>>> > > > > >         at
>> >>>>>>>> > > > > >
>> >>>>>>>> > > > >
>> >>>>>>>> > >
>> >>>>>>>> > >
>> >>>>>>>> > > org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:46)
>> >>>>>>>> > > > > >         at
>> >>>>>>>> > > > > >
>> >>>>>>>> > > > >
>> >>>>>>>> > >
>> >>>>>>>> > >
>> >>>>>>>> > > org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:639)
>> >>>>>>>> > > > > >         at
>> >>>>>>>> > > > > >
>> >>>>>>>> > > > >
>> >>>>>>>> > >
>> >>>>>>>> > >
>> >>>>>>>> > > org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>> >>>>>>>> > > > > >         at
>> >>>>>>>> > > > > >
>> >>>>>>>> > >
>> >>>>>>>> > >
>> >>>>>>>> > > org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:84)
>> >>>>>>>> > > > > >         ... 10 more
>> >>>>>>>> > > > > > *Caused by: java.sql.SQLException: Missing IN or OUT
>> >>>>>>>> > > > > > parameter at
>> >>>>>>>> > > > > index:: 4*
>> >>>>>>>> > > > > >         at
>> >>>>>>>> > > > > >
>> >>>>>>>> > > > >
>> >>>>>>>> > >
>> >>>>>>>> > >
>> >>>>>>>> > > oracle.jdbc.driver.OraclePreparedStatement.processCompletedBindRow(OraclePreparedStatement.java:1844)
>> >>>>>>>> > > > > >         at
>> >>>>>>>> > > > > >
>> >>>>>>>> > > > >
>> >>>>>>>> > >
>> >>>>>>>> > >
>> >>>>>>>> > > oracle.jdbc.driver.OraclePreparedStatement.addBatch(OraclePreparedStatement.java:10213)
>> >>>>>>>> > > > > >         at
>> >>>>>>>> > > > > >
>> >>>>>>>> > > > >
>> >>>>>>>> > >
>> >>>>>>>> > >
>> >>>>>>>> > > oracle.jdbc.driver.OraclePreparedStatementWrapper.addBatch(OraclePreparedStatementWrapper.java:1362)
>> >>>>>>>> > > > > >         at
>> >>>>>>>> > > > > >
>> >>>>>>>> > > > >
>> >>>>>>>> > >
>> >>>>>>>> > >
>> >>>>>>>> > > org.apache.sqoop.mapreduce.UpdateOutputFormat$UpdateRecordWriter.getPreparedStatement(UpdateOutputFormat.java:174)
>> >>>>>>>> > > > > >         at
>> >>>>>>>> > > > > >
>> >>>>>>>> > > > >
>> >>>>>>>> > >
>> >>>>>>>> > >
>> >>>>>>>> > > org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.execUpdate(AsyncSqlRecordWriter.java:149)
>> >>>>>>>> > > > > >         at
>> >>>>>>>> > > > > >
>> >>>>>>>> > > > >
>> >>>>>>>> > >
>> >>>>>>>> > >
>> >>>>>>>> > > org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:215)
>> >>>>>>>> > > > > >         ... 14 more
>> >>>>>>>> > > > > >
>> >>>>>>>> > > > > > I'm using Sqoop 1.4.3 with hadoop1, also tried 1.4.4
>> >>>>>>>> > > > > > with
>> >>>>>>>> > > > > > same
>> >>>>>>>> > > result. I
>> >>>>>>>> > > > > > have the standard Oracle JDBC driver 6 with Java 7.
>> >>>>>>>> > > > > > I went through all the documentation, Sqoop user guide
>> >>>>>>>> > > > > > says this is
>> >>>>>>>> > > > > > supported for built-in connector which I understand I
>> >>>>>>>> > > > > > am
>> >>>>>>>> > > > > > using.
>> >>>>>>>> > > > > >
>> >>>>>>>> > > > > > Here is the full command:
>> >>>>>>>> > > > > > $sqoopExecutable export \
>> >>>>>>>> > > > > > --outdir $outdir \
>> >>>>>>>> > > > > > --connect $connectionString --table $table_client
>> >>>>>>>> > > > > > --username $dbUser
>> >>>>>>>> > > > > > --password $dbUserPasswd \
>> >>>>>>>> > > > > > --columns CLIENT_ID,EXP_ID,BUCKET_ID --update-key
>> >>>>>>>> > > > > > CLIENT_ID,EXP_ID,BUCKET_ID \
>> >>>>>>>> > > > > > --fields-terminated-by '\t' --update-mode allowinsert \
>> >>>>>>>> > > > > > --export-dir $dataSource_client > $sqoopLog 2>&1
>> >>>>>>>> > > > > >
>> >>>>>>>> > > > > > Can someone please shed some light on this?
>> >>>>>>>> > > > > > Thank you in advance.
>> >>>>>>>> > > > > >
>> >>>>>>>> > > > > > Leo
>> >>>>>>>> > > > >
>> >>>>>>>> > >
>> >>>>>>>
>> >>>>>>>
>> >>>>>>
>> >>>>>
>> >>>>
>> >>>
>> >>
>> >
>
>
