sqoop-user mailing list archives

From Leonardo Brambilla <lbrambi...@contractor.elance-odesk.com>
Subject Re: Fwd: Sqoop export not working when using "update-key"
Date Fri, 18 Jul 2014 03:46:56 GMT
I think I found something. The Java class generated when using --update-key
differs from the one generated without it. The class that throws the
exception never writes the fields that are not part of the update-key. I
also see (with --verbose) that when using --update-key there is an extra
debug line that lists only the key column:
14/07/17 22:53:27 DEBUG orm.ClassWriter: db write column order:
14/07/17 22:53:27 DEBUG orm.ClassWriter:   SEARCH_DATE

Below is the method generated for the command *without* --update-key

  public int write(PreparedStatement __dbStmt, int __off) throws SQLException {
    JdbcWritableBridge.writeTimestamp(SEARCH_DATE, 1 + __off, 93, __dbStmt);
    JdbcWritableBridge.writeString(SEARCH_TYPE, 2 + __off, 12, __dbStmt);
    JdbcWritableBridge.writeString(USER_AGENT, 3 + __off, 12, __dbStmt);
    JdbcWritableBridge.writeString(SRCH_KEYWORD, 4 + __off, 12, __dbStmt);
    JdbcWritableBridge.writeBigDecimal(SRCH_COUNT, 5 + __off, 2, __dbStmt);
    return 5;
  }

Below is the one generated for the command *with* --update-key
  public int write(PreparedStatement __dbStmt, int __off) throws SQLException {
    JdbcWritableBridge.writeTimestamp(SEARCH_DATE, 1 + __off, 93, __dbStmt);
    return 1;
  }
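
For comparison, here is what I would have expected the update-case method to
look like, with the non-key columns bound first (for the SET clause) and the
update-key column bound last (for the WHERE clause). This is only my own
sketch extrapolated from the class above, not actual Sqoop output:

  // Hypothetical complete write() for UPDATE ... SET ... WHERE SEARCH_DATE = ?
  // Non-key columns first (SET clause), update-key column last (WHERE clause).
  public int write(PreparedStatement __dbStmt, int __off) throws SQLException {
    JdbcWritableBridge.writeString(SEARCH_TYPE, 1 + __off, 12, __dbStmt);
    JdbcWritableBridge.writeString(USER_AGENT, 2 + __off, 12, __dbStmt);
    JdbcWritableBridge.writeString(SRCH_KEYWORD, 3 + __off, 12, __dbStmt);
    JdbcWritableBridge.writeBigDecimal(SRCH_COUNT, 4 + __off, 2, __dbStmt);
    JdbcWritableBridge.writeTimestamp(SEARCH_DATE, 5 + __off, 93, __dbStmt);
    return 5;
  }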

I tried to force the export to use the properly generated class with the
--jar-file and --class-name parameters, but that didn't work; it looks like
those parameters are not allowed in the export command. This is what I tried
in order to force the use of the properly generated class:
sqoop export \
--connect jdbc:oracle:thin:@ddb04.local.com:1541/test04 \
--update-key "SEARCH_DATE" \
--columns $columns \
--table $table --username $user --password $passwd \
--fields-terminated-by "=" --export-dir $exportDir \
--jar-file SEARCH_TABLE.jar --class-name SEARCH_TABLE
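
For the record, a minimal sketch of the two-step variant I was attempting:
generate the class once with sqoop codegen and then point the export at the
resulting jar (the /tmp path below is illustrative; codegen prints the actual
location of the jar it writes):

sqoop codegen \
--connect jdbc:oracle:thin:@ddb04.local.com:1541/test04 \
--table $table --username $user --password $passwd \
--bindir /tmp/sqoop-codegen --class-name SEARCH_TABLE

sqoop export \
--connect jdbc:oracle:thin:@ddb04.local.com:1541/test04 \
--update-key "SEARCH_DATE" \
--columns $columns \
--table $table --username $user --password $passwd \
--fields-terminated-by "=" --export-dir $exportDir \
--jar-file /tmp/sqoop-codegen/SEARCH_TABLE.jar --class-name SEARCH_TABLE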



On Thu, Jul 17, 2014 at 5:04 PM, Leonardo Brambilla <
lbrambilla@contractor.elance-odesk.com> wrote:

> Yes, the update-key is a subset of columns.
>
>
> On Thu, Jul 17, 2014 at 4:16 PM, Gwen Shapira <gshapira@cloudera.com>
> wrote:
>
>> Does the update column appear in $columns? It should be in there.
>>
>>
>> On Thu, Jul 17, 2014 at 10:48 AM, Leonardo Brambilla <
>> lbrambilla@contractor.elance-odesk.com> wrote:
>>
>>> Hi Gwen, thank you for replying.
>>>
>>> I went to the data node, to the userlogs, and all I found in the syslog
>>> file is what I already posted:
>>> 2014-07-17 10:19:09,280 INFO org.apache.hadoop.util.NativeCodeLoader:
>>> Loaded the native-hadoop library
>>> 2014-07-17 10:19:09,700 INFO org.apache.hadoop.util.ProcessTree: setsid
>>> exited with exit code 0
>>> 2014-07-17 10:19:09,706 INFO org.apache.hadoop.mapred.Task:  Using
>>> ResourceCalculatorPlugin :
>>> org.apache.hadoop.util.LinuxResourceCalculatorPlugin@34c3a7c0
>>> 2014-07-17 10:19:10,266 INFO
>>> org.apache.sqoop.mapreduce.AutoProgressMapper: Auto-progress thread is
>>> finished. keepGoing=false
>>> 2014-07-17 10:19:10,476 INFO org.apache.hadoop.mapred.TaskLogsTruncater:
>>> Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>>> 2014-07-17 10:19:10,537 INFO org.apache.hadoop.io.nativeio.NativeIO:
>>> Initialized cache for UID to User mapping with a cache timeout of 14400
>>> seconds.
>>> 2014-07-17 10:19:10,537 INFO org.apache.hadoop.io.nativeio.NativeIO: Got
>>> UserName elance for UID 666 from the native implementation
>>> 2014-07-17 10:19:10,539 ERROR
>>> org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
>>> as:elance cause:java.io.IOException: java.sql.SQLException: Missing IN or
>>> OUT parameter at index:: 2
>>> 2014-07-17 10:19:10,540 WARN org.apache.hadoop.mapred.Child: Error
>>> running child
>>> java.io.IOException: java.sql.SQLException: Missing IN or OUT parameter
>>> at index:: 2
>>>         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:184)
>>>         at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651)
>>>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
>>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>>>         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>         at javax.security.auth.Subject.doAs(Subject.java:415)
>>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
>>>         at org.apache.hadoop.mapred.Child.main(Child.java:249)
>>> Caused by: java.sql.SQLException: Missing IN or OUT parameter at index:: 2
>>>         at oracle.jdbc.driver.OraclePreparedStatement.processCompletedBindRow(OraclePreparedStatement.java:1844)
>>>         at oracle.jdbc.driver.OraclePreparedStatement.addBatch(OraclePreparedStatement.java:10213)
>>>         at oracle.jdbc.driver.OraclePreparedStatementWrapper.addBatch(OraclePreparedStatementWrapper.java:1362)
>>>         at org.apache.sqoop.mapreduce.UpdateOutputFormat$UpdateRecordWriter.getPreparedStatement(UpdateOutputFormat.java:174)
>>>         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.execUpdate(AsyncSqlRecordWriter.java:149)
>>>         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:181)
>>>         ... 8 more
>>> 2014-07-17 10:19:10,543 INFO org.apache.hadoop.mapred.Task: Runnning
>>> cleanup for the task
>>>
>>> There is no more data than that.
>>> Can you please check my sqoop command and validate that I'm using the
>>> proper arguments? The argument "--columns" is used in export to tell sqoop
>>> the order in which it should read the columns from the file, right?
>>> Does the last column need to have a delimiter too?
>>> The source file should be OK; keep in mind that it works for insert but
>>> fails when I add the parameter --update-key.
>>>
>>> Thanks
>>> Leo
>>>
>>>
>>> On Thu, Jul 17, 2014 at 1:52 PM, Gwen Shapira <gshapira@cloudera.com>
>>> wrote:
>>>
>>>> I can confirm that Sqoop export update works on Oracle, both with and
>>>> without Oraoop.
>>>>
>>>> The specific exception you are getting indicates that Oracle expects at
>>>> least 4 columns of data and the HDFS file may have fewer than that.
>>>>
>>>> Can you double check that the columns in Oracle and your data file
>>>> match? And that you are using a correct delimiter?
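>>>>
>>>> For instance, with --fields-terminated-by "=" and five columns, each
>>>> line in the export directory should carry exactly five "="-separated
>>>> fields; a made-up record for your table might look like:
>>>> 2014-07-17 00:00:00.0=WEB=Mozilla/5.0=hadoop=42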
>>>>
>>>> And as Jarcec said, if you have access to the Task Tracker user logs
>>>> for one of the mappers, you'll have much more detail to work with - for
>>>> example the specific line that failed.
>>>>
>>>> Gwen
>>>>
>>>>
>>>>
>>>>
>>>> On Thu, Jul 17, 2014 at 7:44 AM, Leonardo Brambilla <
>>>> lbrambilla@contractor.elance-odesk.com> wrote:
>>>>
>>>>> Hello Jarek,
>>>>>
>>>>> I'm getting back to this issue. I'm trying to fix it by using Oraoop,
>>>>> but that doesn't avoid the exception:
>>>>> java.io.IOException: java.sql.SQLException: Missing IN or OUT parameter at index:: 4
>>>>>
>>>>> I ran a couple of tests and I can tell that the following command
>>>>> works to insert new rows:
>>>>> sqoop export \
>>>>> --connect jdbc:oracle:thin:@ddb04.local.com:1541/test04 \
>>>>> --columns $columns \
>>>>> --table $table --username $user --password $passwd \
>>>>> --fields-terminated-by "=" --export-dir $exportDir
>>>>>
>>>>> But the following command (just added --update-key) throws an
>>>>> exception:
>>>>> sqoop export \
>>>>> --connect jdbc:oracle:thin:@ddb04.local.com:1541/test04 \
>>>>> --update-key "SEARCH_DATE" \
>>>>> --columns $columns \
>>>>> --table $table --username $user --password $passwd \
>>>>> --fields-terminated-by "=" --export-dir $exportDir
>>>>>
>>>>> DB is Oracle 11.2.0.2.0
>>>>> Sqoop is 1.4.4
>>>>> Java 1.7
>>>>> Oraoop 1.6
>>>>> Oracle JDBC driver "ojdbc6.jar", implementation version 11.2.0.3.0
>>>>>
>>>>> Like I said before, I have already posted here all the logging I can
>>>>> get from the failed task.
>>>>>
>>>>> Can you confirm that Sqoop export update works on Oracle DBs?
>>>>> Thanks in advance
>>>>> Leo
>>>>>
>>>>>
>>>>>
>>>>> On Fri, May 16, 2014 at 4:51 PM, Jarek Jarcec Cecho <jarcec@apache.org> wrote:
>>>>>
>>>>>> Hi Leonardo,
>>>>>> sadly, the Sqoop output might not be that helpful in this case;
>>>>>> could you please share with us the failed map task log?
>>>>>>
>>>>>> The easiest way to get it on Hadoop 1.x is to open the JobTracker
>>>>>> web interface, find the failed Sqoop job, and navigate to the
>>>>>> failed map tasks.
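>>>>>>
>>>>>> If the web interface is not handy, the same logs also live on the
>>>>>> local disk of the node that ran the failed attempt; a rough sketch,
>>>>>> assuming a stock Hadoop 1.x layout (the exact log directory varies
>>>>>> by install):
>>>>>>
>>>>>> find $HADOOP_LOG_DIR/userlogs -name syslog \
>>>>>>   -path '*attempt_201404190827_0998_m_000001_0*'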
>>>>>>
>>>>>> Jarcec
>>>>>>
>>>>>> On Tue, May 13, 2014 at 11:36:34AM -0300, Leonardo Brambilla wrote:
>>>>>> > Hi Jarek, find below the full Sqoop-generated log. I went through all
>>>>>> > the cluster's nodes looking for this task's logs and there is nothing
>>>>>> > more than this same error. I really don't know what else to look for.
>>>>>> >
>>>>>> > Thanks
>>>>>> >
>>>>>> >
>>>>>> > Warning: /usr/lib/hbase does not exist! HBase imports will fail.
>>>>>> > Please set $HBASE_HOME to the root of your HBase installation.
>>>>>> > 14/05/13 10:26:41 WARN tool.BaseSqoopTool: Setting your password on
>>>>>> > the command-line is insecure. Consider using -P instead.
>>>>>> > 14/05/13 10:26:41 INFO manager.SqlManager: Using default fetchSize of 1000
>>>>>> > 14/05/13 10:26:41 INFO manager.OracleManager: Time zone has been set to GMT
>>>>>> > 14/05/13 10:26:41 INFO tool.CodeGenTool: Beginning code generation
>>>>>> > 14/05/13 10:26:41 INFO manager.OracleManager: Time zone has been set to GMT
>>>>>> > 14/05/13 10:26:41 INFO manager.SqlManager: Executing SQL statement:
>>>>>> > SELECT t.* FROM etl.EXPT_SPAM_RED_JOB t WHERE 1=0
>>>>>> > 14/05/13 10:26:41 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is
>>>>>> > /home/elance/hadoop
>>>>>> > Note: /tmp/sqoop-elance/compile/9f8f413ab105fbe67d985bdb29534d27/etl_EXPT_SPAM_RED_JOB.java
>>>>>> > uses or overrides a deprecated API.
>>>>>> > Note: Recompile with -Xlint:deprecation for details.
>>>>>> > 14/05/13 10:26:42 INFO orm.CompilationManager: Writing jar file:
>>>>>> > /tmp/sqoop-elance/compile/9f8f413ab105fbe67d985bdb29534d27/etl.EXPT_SPAM_RED_JOB.jar
>>>>>> > 14/05/13 10:26:42 INFO mapreduce.ExportJobBase: Beginning export of
>>>>>> > etl.EXPT_SPAM_RED_JOB
>>>>>> > 14/05/13 10:26:43 INFO input.FileInputFormat: Total input paths to process : 1
>>>>>> > 14/05/13 10:26:43 INFO input.FileInputFormat: Total input paths to process : 1
>>>>>> > 14/05/13 10:26:44 INFO mapred.JobClient: Running job: job_201404190827_0998
>>>>>> > 14/05/13 10:26:45 INFO mapred.JobClient:  map 0% reduce 0%
>>>>>> > 14/05/13 10:26:53 INFO mapred.JobClient:  map 25% reduce 0%
>>>>>> > 14/05/13 10:26:54 INFO mapred.JobClient:  map 75% reduce 0%
>>>>>> > 14/05/13 10:26:55 INFO mapred.JobClient: Task Id :
>>>>>> > attempt_201404190827_0998_m_000001_0, Status : FAILED
>>>>>> > java.io.IOException: java.sql.SQLException: Missing IN or OUT
>>>>>> > parameter at index:: 4
>>>>>> >         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:184)
>>>>>> >         at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651)
>>>>>> >         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
>>>>>> >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>>>>>> >         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>>>>>> >         at java.security.AccessController.doPrivileged(Native Method)
>>>>>> >         at javax.security.auth.Subject.doAs(Subject.java:415)
>>>>>> >         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
>>>>>> >         at org.apache.hadoop.mapred.Child.main(Child.java:249)
>>>>>> > Caused by: java.sql.SQLException: Missing IN or OUT parameter at index:: 4
>>>>>> >         at oracle.jdbc.driver.OraclePreparedStatement.processCompletedBindRow(OraclePreparedStatement.java:1844)
>>>>>> >         at oracle.jdbc.driver.OraclePreparedStatement.addBatch(OraclePreparedStatement.java:10213)
>>>>>> >         at oracle.jdbc.driver.OraclePreparedStatementWrapper.addBatch(OraclePreparedStatementWrapper.java:1362)
>>>>>> >         at org.apache.sqoop.mapreduce.UpdateOutputFormat$UpdateRecordWriter.getPreparedStatement(UpdateOutputFormat.java:174)
>>>>>> >         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.execUpdate(AsyncSqlRecordWriter.java:149)
>>>>>> >         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:181)
>>>>>> >         ... 8 more
>>>>>> >
>>>>>> > 14/05/13 10:27:00 INFO mapred.JobClient: Task Id :
>>>>>> > attempt_201404190827_0998_m_000001_1, Status : FAILED
>>>>>> > java.io.IOException: java.sql.SQLException: Missing IN or OUT
>>>>>> > parameter at index:: 4
>>>>>> >         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:184)
>>>>>> >         at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651)
>>>>>> >         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
>>>>>> >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>>>>>> >         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>>>>>> >         at java.security.AccessController.doPrivileged(Native Method)
>>>>>> >         at javax.security.auth.Subject.doAs(Subject.java:415)
>>>>>> >         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
>>>>>> >         at org.apache.hadoop.mapred.Child.main(Child.java:249)
>>>>>> > Caused by: java.sql.SQLException: Missing IN or OUT parameter at index:: 4
>>>>>> >         at oracle.jdbc.driver.OraclePreparedStatement.processCompletedBindRow(OraclePreparedStatement.java:1844)
>>>>>> >         at oracle.jdbc.driver.OraclePreparedStatement.addBatch(OraclePreparedStatement.java:10213)
>>>>>> >         at oracle.jdbc.driver.OraclePreparedStatementWrapper.addBatch(OraclePreparedStatementWrapper.java:1362)
>>>>>> >         at org.apache.sqoop.mapreduce.UpdateOutputFormat$UpdateRecordWriter.getPreparedStatement(UpdateOutputFormat.java:174)
>>>>>> >         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.execUpdate(AsyncSqlRecordWriter.java:149)
>>>>>> >         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:181)
>>>>>> >         ... 8 more
>>>>>> >
>>>>>> > 14/05/13 10:27:05 INFO mapred.JobClient: Task Id :
>>>>>> > attempt_201404190827_0998_m_000001_2, Status : FAILED
>>>>>> > java.io.IOException: java.sql.SQLException: Missing IN or OUT
>>>>>> > parameter at index:: 4
>>>>>> >         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:184)
>>>>>> >         at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651)
>>>>>> >         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
>>>>>> >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>>>>>> >         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>>>>>> >         at java.security.AccessController.doPrivileged(Native Method)
>>>>>> >         at javax.security.auth.Subject.doAs(Subject.java:415)
>>>>>> >         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
>>>>>> >         at org.apache.hadoop.mapred.Child.main(Child.java:249)
>>>>>> > Caused by: java.sql.SQLException: Missing IN or OUT parameter at index:: 4
>>>>>> >         at oracle.jdbc.driver.OraclePreparedStatement.processCompletedBindRow(OraclePreparedStatement.java:1844)
>>>>>> >         at oracle.jdbc.driver.OraclePreparedStatement.addBatch(OraclePreparedStatement.java:10213)
>>>>>> >         at oracle.jdbc.driver.OraclePreparedStatementWrapper.addBatch(OraclePreparedStatementWrapper.java:1362)
>>>>>> >         at org.apache.sqoop.mapreduce.UpdateOutputFormat$UpdateRecordWriter.getPreparedStatement(UpdateOutputFormat.java:174)
>>>>>> >         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.execUpdate(AsyncSqlRecordWriter.java:149)
>>>>>> >         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:181)
>>>>>> >         ... 8 more
>>>>>> >
>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient: Job complete: job_201404190827_0998
>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient: Counters: 20
>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:   Job Counters
>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=30548
>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Total time spent by all
>>>>>> > reduces waiting after reserving slots (ms)=0
>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Total time spent by all
>>>>>> > maps waiting after reserving slots (ms)=0
>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Rack-local map tasks=5
>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Launched map tasks=7
>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Data-local map tasks=2
>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Failed map tasks=1
>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:   File Output Format Counters
>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Bytes Written=0
>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:   FileSystemCounters
>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     HDFS_BYTES_READ=459
>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=189077
>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:   File Input Format Counters
>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Bytes Read=0
>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:   Map-Reduce Framework
>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Map input records=0
>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Physical memory (bytes) snapshot=363053056
>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Spilled Records=0
>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     CPU time spent (ms)=2110
>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Total committed heap usage (bytes)=553517056
>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=2344087552
>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Map output records=0
>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     SPLIT_RAW_BYTES=404
>>>>>> > 14/05/13 10:27:13 INFO mapreduce.ExportJobBase: Transferred 459 bytes
>>>>>> > in 30.0642 seconds (15.2673 bytes/sec)
>>>>>> > 14/05/13 10:27:13 INFO mapreduce.ExportJobBase: Exported 0 records.
>>>>>> > 14/05/13 10:27:13 ERROR tool.ExportTool: Error during export: Export job failed!
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > On Mon, May 12, 2014 at 10:44 PM, Jarek Jarcec Cecho <jarcec@apache.org> wrote:
>>>>>> >
>>>>>> > > The map task log contains the entire executed query and a lot of
>>>>>> > > additional information, and hence it's super useful in such cases.
>>>>>> > >
>>>>>> > > Jarcec
>>>>>> > >
>>>>>> > > On Mon, May 12, 2014 at 02:59:56PM -0300, Leonardo Brambilla wrote:
>>>>>> > > > Hi Jarek,
>>>>>> > > >
>>>>>> > > > thanks for replying, I don't have the logs. I'll see if I can run
>>>>>> > > > the task again and then keep the logs.
>>>>>> > > >
>>>>>> > > > Anyway, I don't remember seeing anything other than this
>>>>>> > > > SQLException about the missing parameter.
>>>>>> > > >
>>>>>> > > > Leo
>>>>>> > > >
>>>>>> > > >
>>>>>> > > > On Sun, May 11, 2014 at 10:59 AM, Jarek Jarcec Cecho <jarcec@apache.org> wrote:
>>>>>> > > >
>>>>>> > > > > Hi Leonardo,
>>>>>> > > > > would you mind sharing with us the task log from the failed map task?
>>>>>> > > > >
>>>>>> > > > > Jarcec
>>>>>> > > > >
>>>>>> > > > > On Sun, May 11, 2014 at 10:33:11AM -0300, Leonardo Brambilla wrote:
>>>>>> > > > > > Hello, I am struggling to make this work, and it is a feature
>>>>>> > > > > > I really need.
>>>>>> > > > > >
>>>>>> > > > > > I have a process that generates new data daily; this data
>>>>>> > > > > > needs to be pushed to a table in Oracle, and the table might
>>>>>> > > > > > already have the same data from previous loads. I need to
>>>>>> > > > > > avoid duplicating data in it. Pretty common scenario, right? =)
>>>>>> > > > > >
>>>>>> > > > > > I am using sqoop export for this, no special arguments, just
>>>>>> > > > > > columns, fields-terminated-by, table and db connection, plus
>>>>>> > > > > > the argument "update-mode allowinsert".
>>>>>> > > > > >
>>>>>> > > > > > Now, when I also include the argument "update-key" with a
>>>>>> > > > > > comma-separated list of fields (the same list as for the
>>>>>> > > > > > columns argument), I get the following Oracle driver error:
>>>>>> > > > > >
>>>>>> > > > > > 14/05/07 16:00:03 INFO mapred.JobClient: Task Id :
>>>>>> > > > > > attempt_201404190827_0928_m_000003_2, Status : FAILED
>>>>>> > > > > > java.io.IOException: Can't export data, please check task tracker logs
>>>>>> > > > > >         at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
>>>>>> > > > > >         at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
>>>>>> > > > > >         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>>>>>> > > > > >         at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
>>>>>> > > > > >         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>>>>>> > > > > >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>>>>>> > > > > >         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>>>>>> > > > > >         at java.security.AccessController.doPrivileged(Native Method)
>>>>>> > > > > >         at javax.security.auth.Subject.doAs(Subject.java:415)
>>>>>> > > > > >         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
>>>>>> > > > > >         at org.apache.hadoop.mapred.Child.main(Child.java:249)
>>>>>> > > > > > Caused by: java.io.IOException: java.sql.SQLException:
>>>>>> > > > > > Missing IN or OUT parameter at index:: 4
>>>>>> > > > > >         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:220)
>>>>>> > > > > >         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:46)
>>>>>> > > > > >         at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:639)
>>>>>> > > > > >         at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>>>>>> > > > > >         at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:84)
>>>>>> > > > > >         ... 10 more
>>>>>> > > > > > Caused by: java.sql.SQLException: Missing IN or OUT parameter at index:: 4
>>>>>> > > > > >         at oracle.jdbc.driver.OraclePreparedStatement.processCompletedBindRow(OraclePreparedStatement.java:1844)
>>>>>> > > > > >         at oracle.jdbc.driver.OraclePreparedStatement.addBatch(OraclePreparedStatement.java:10213)
>>>>>> > > > > >         at oracle.jdbc.driver.OraclePreparedStatementWrapper.addBatch(OraclePreparedStatementWrapper.java:1362)
>>>>>> > > > > >         at org.apache.sqoop.mapreduce.UpdateOutputFormat$UpdateRecordWriter.getPreparedStatement(UpdateOutputFormat.java:174)
>>>>>> > > > > >         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.execUpdate(AsyncSqlRecordWriter.java:149)
>>>>>> > > > > >         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:215)
>>>>>> > > > > >         ... 14 more
>>>>>> > > > > >
>>>>>> > > > > > I'm using Sqoop 1.4.3 with hadoop1, and I also tried 1.4.4
>>>>>> > > > > > with the same result. I have the standard Oracle JDBC driver 6
>>>>>> > > > > > with Java 7.
>>>>>> > > > > > I went through all the documentation; the Sqoop user guide
>>>>>> > > > > > says this is supported for the built-in connector, which I
>>>>>> > > > > > understand I am using.
>>>>>> > > > > >
>>>>>> > > > > > Here is the full command:
>>>>>> > > > > > $sqoopExecutable export \
>>>>>> > > > > > --outdir $outdir \
>>>>>> > > > > > --connect $connectionString --table $table_client --username $dbUser --password $dbUserPasswd \
>>>>>> > > > > > --columns CLIENT_ID,EXP_ID,BUCKET_ID --update-key CLIENT_ID,EXP_ID,BUCKET_ID \
>>>>>> > > > > > --fields-terminated-by '\t' --update-mode allowinsert \
>>>>>> > > > > > --export-dir $dataSource_client > $sqoopLog 2>&1
>>>>>> > > > > >
>>>>>> > > > > > Can someone please shed some light on this?
>>>>>> > > > > > Thank you in advance.
>>>>>> > > > > >
>>>>>> > > > > > Leo
>>>>>> > > > >
>>>>>> > >
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
