sqoop-user mailing list archives

From Leonardo Brambilla <lbrambi...@contractor.elance-odesk.com>
Subject Re: Fwd: Sqoop export not working when using "update-key"
Date Fri, 18 Jul 2014 15:23:11 GMT
Hello, some more findings: could this bug be related to my problem?
https://issues.apache.org/jira/browse/SQOOP-824
I know it says it was fixed in 1.4.3, but maybe the fix introduced another
edge case.
I still don't understand how to run the export command without specifying the
--columns parameter. Can you tell me what the default behavior is when you
omit --columns? Does the source file need to have the same column order
as the target table?

Thanks


On Fri, Jul 18, 2014 at 12:46 AM, Leonardo Brambilla <
lbrambilla@contractor.elance-odesk.com> wrote:

> I think I found something. The Java class generated when using update-key
> differs from the one generated without it. The one that throws the exception
> never writes the fields that are not part of the update-key. I also see
> (with --verbose) that when using --update-key there is an extra debug line
> that says
> 14/07/17 22:53:27 DEBUG orm.ClassWriter: db write column order:
> 14/07/17 22:53:27 DEBUG orm.ClassWriter:   SEARCH_DATE
>
> Below is the method generated for the command *without* --update-key
>
>   public int write(PreparedStatement __dbStmt, int __off) throws SQLException {
>     JdbcWritableBridge.writeTimestamp(SEARCH_DATE, 1 + __off, 93, __dbStmt);
>     JdbcWritableBridge.writeString(SEARCH_TYPE, 2 + __off, 12, __dbStmt);
>     JdbcWritableBridge.writeString(USER_AGENT, 3 + __off, 12, __dbStmt);
>     JdbcWritableBridge.writeString(SRCH_KEYWORD, 4 + __off, 12, __dbStmt);
>     JdbcWritableBridge.writeBigDecimal(SRCH_COUNT, 5 + __off, 2, __dbStmt);
>     return 5;
>   }
>
> Below is the one generated for the command *with* --update-key
>   public int write(PreparedStatement __dbStmt, int __off) throws SQLException {
>     JdbcWritableBridge.writeTimestamp(SEARCH_DATE, 1 + __off, 93, __dbStmt);
>     return 1;
>   }
>
> I tried to force export to use the properly generated class with the
> parameters "jar-file" and "class-name", but that didn't work; it's as if
> those params are not allowed in the export command. This is what I ran to
> try to force the use of the properly generated class:
>  sqoop export \
> --connect jdbc:oracle:thin:@ddb04.local.com:1541/test04 \
> --update-key "SEARCH_DATE" \
> --columns $columns \
> --table $table --username $user --password $passwd \
> --fields-terminated-by "=" --export-dir $exportDir \
> --jar-file SEARCH_TABLE.jar --class-name SEARCH_TABLE
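>
> For the record, the way I'd expect to produce that jar in the first place
> is the codegen tool (a sketch; the ./gen paths are made up):
>
> sqoop codegen \
> --connect jdbc:oracle:thin:@ddb04.local.com:1541/test04 \
> --table $table --username $user --password $passwd \
> --outdir ./gen --bindir ./gen
>
> and then point export at the generated artifacts with
> --jar-file ./gen/SEARCH_TABLE.jar --class-name SEARCH_TABLE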
>
>
>
> On Thu, Jul 17, 2014 at 5:04 PM, Leonardo Brambilla <
> lbrambilla@contractor.elance-odesk.com> wrote:
>
>> Yes, the update-key is a subset of columns.
>>
>>
>> On Thu, Jul 17, 2014 at 4:16 PM, Gwen Shapira <gshapira@cloudera.com>
>> wrote:
>>
>>> Does the update column appear in $columns? It should be in there.
>>>
>>>
>>> On Thu, Jul 17, 2014 at 10:48 AM, Leonardo Brambilla <
>>> lbrambilla@contractor.elance-odesk.com> wrote:
>>>
>>>> Hi Gwen, thank you for replying.
>>>>
>>>> I went to the data node and checked the userlogs; all I found in the
>>>> syslog file is what I already posted:
>>>> 2014-07-17 10:19:09,280 INFO org.apache.hadoop.util.NativeCodeLoader:
>>>> Loaded the native-hadoop library
>>>> 2014-07-17 10:19:09,700 INFO org.apache.hadoop.util.ProcessTree: setsid
>>>> exited with exit code 0
>>>> 2014-07-17 10:19:09,706 INFO org.apache.hadoop.mapred.Task:  Using
>>>> ResourceCalculatorPlugin :
>>>> org.apache.hadoop.util.LinuxResourceCalculatorPlugin@34c3a7c0
>>>> 2014-07-17 10:19:10,266 INFO
>>>> org.apache.sqoop.mapreduce.AutoProgressMapper: Auto-progress thread is
>>>> finished. keepGoing=false
>>>> 2014-07-17 10:19:10,476 INFO
>>>> org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater
>>>> with mapRetainSize=-1 and reduceRetainSize=-1
>>>> 2014-07-17 10:19:10,537 INFO org.apache.hadoop.io.nativeio.NativeIO:
>>>> Initialized cache for UID to User mapping with a cache timeout of 14400
>>>> seconds.
>>>> 2014-07-17 10:19:10,537 INFO org.apache.hadoop.io.nativeio.NativeIO:
>>>> Got UserName elance for UID 666 from the native implementation
>>>> 2014-07-17 10:19:10,539 ERROR
>>>> org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
>>>> as:elance cause:java.io.IOException: java.sql.SQLException: Missing IN or
>>>> OUT parameter at index:: 2
>>>> 2014-07-17 10:19:10,540 WARN org.apache.hadoop.mapred.Child: Error
>>>> running child
>>>> java.io.IOException: java.sql.SQLException: Missing IN or OUT parameter
>>>> at index:: 2
>>>>  at
>>>> org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:184)
>>>> at
>>>> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651)
>>>>  at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
>>>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>>>>  at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>>  at javax.security.auth.Subject.doAs(Subject.java:415)
>>>> at
>>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
>>>>  at org.apache.hadoop.mapred.Child.main(Child.java:249)
>>>> Caused by: java.sql.SQLException: Missing IN or OUT parameter at
>>>> index:: 2
>>>>  at
>>>> oracle.jdbc.driver.OraclePreparedStatement.processCompletedBindRow(OraclePreparedStatement.java:1844)
>>>> at
>>>> oracle.jdbc.driver.OraclePreparedStatement.addBatch(OraclePreparedStatement.java:10213)
>>>>  at
>>>> oracle.jdbc.driver.OraclePreparedStatementWrapper.addBatch(OraclePreparedStatementWrapper.java:1362)
>>>> at
>>>> org.apache.sqoop.mapreduce.UpdateOutputFormat$UpdateRecordWriter.getPreparedStatement(UpdateOutputFormat.java:174)
>>>>  at
>>>> org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.execUpdate(AsyncSqlRecordWriter.java:149)
>>>> at
>>>> org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:181)
>>>>  ... 8 more
>>>> 2014-07-17 10:19:10,543 INFO org.apache.hadoop.mapred.Task: Runnning
>>>> cleanup for the task
>>>>
>>>> There is no more data than that.
>>>> Can you please check my sqoop command and validate that I'm using the
>>>> proper arguments? The argument "--columns" is used in export to tell Sqoop
>>>> the order in which it should read the columns from the file, right?
>>>> Does the last column need a trailing delimiter too?
>>>> The source file should be fine; bear in mind that it works for inserts but
>>>> fails when I add the parameter --update-key.
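>>>>
>>>> To make the delimiter question concrete, I'd expect a record with those
>>>> five columns to look like this (values made up for illustration):
>>>> 2014-07-17 10:19:09.0=WEB=Mozilla/5.0=shoes=42
>>>> i.e. "=" between fields and, as I understand it, no delimiter after the
>>>> last field.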
>>>>
>>>> Thanks
>>>> Leo
>>>>
>>>>
>>>> On Thu, Jul 17, 2014 at 1:52 PM, Gwen Shapira <gshapira@cloudera.com>
>>>> wrote:
>>>>
>>>>> I can confirm that Sqoop export update works on Oracle, both with and
>>>>> without Oraoop.
>>>>>
>>>>> The specific exception you are getting indicates that Oracle expects
>>>>> at least 4 columns of data and the HDFS file may have fewer than that.
>>>>>
>>>>> Can you double check that the columns in Oracle and your data file
>>>>> match? And that you are using a correct delimiter?
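>>>>>
>>>>> One quick way to compare the two (a sketch; 'SEARCH_TABLE' and the
>>>>> part-m-00000 file name are placeholders for your actual table and file):
>>>>>
>>>>> sqoop eval --connect $connectionString --username $user --password $passwd \
>>>>>   --query "SELECT column_name, column_id FROM user_tab_columns WHERE table_name = 'SEARCH_TABLE' ORDER BY column_id"
>>>>> hadoop fs -cat $exportDir/part-m-00000 | head -1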
>>>>>
>>>>> And as Jarcec said, if you have access to the Task Tracker user logs
>>>>> for one of the mappers, you'll have much more detail to work with -
>>>>> for example, the specific line that failed.
>>>>>
>>>>> Gwen
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Jul 17, 2014 at 7:44 AM, Leonardo Brambilla <
>>>>> lbrambilla@contractor.elance-odesk.com> wrote:
>>>>>
>>>>>> Hello Jarek,
>>>>>>
>>>>>> I'm getting back to this issue. I tried to work around it by using
>>>>>> Oraoop, but that doesn't avoid the exception:
>>>>>> java.io.IOException: java.sql.SQLException: Missing IN or OUT
>>>>>> parameter at index:: 4
>>>>>>
>>>>>> I ran a couple of tests and I can tell that the following command
>>>>>> works to insert new rows:
>>>>>> sqoop export \
>>>>>> --connect jdbc:oracle:thin:@ddb04.local.com:1541/test04 \
>>>>>> --columns $columns \
>>>>>> --table $table --username $user --password $passwd \
>>>>>> --fields-terminated-by "=" --export-dir $exportDir
>>>>>>
>>>>>> But the following command (just added --update-key) throws an
>>>>>> exception:
>>>>>> sqoop export \
>>>>>> --connect jdbc:oracle:thin:@ddb04.local.com:1541/test04 \
>>>>>> --update-key "SEARCH_DATE" \
>>>>>> --columns $columns \
>>>>>> --table $table --username $user --password $passwd \
>>>>>> --fields-terminated-by "=" --export-dir $exportDir
>>>>>>
>>>>>> DB is Oracle 11.2.0.2.0
>>>>>> Sqoop is 1.4.4
>>>>>> Java 1.7
>>>>>> Oraoop 1.6
>>>>>> Oracle JDBC driver "ojdbc6.jar", implementation version 11.2.0.3.0
>>>>>>
>>>>>> Like I said before, I have already posted here all the log output I
>>>>>> can get from the failed task.
>>>>>>
>>>>>> Can you confirm that Sqoop export update works on Oracle DBs?
>>>>>> Thanks in advance
>>>>>> Leo
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, May 16, 2014 at 4:51 PM, Jarek Jarcec Cecho <
>>>>>> jarcec@apache.org> wrote:
>>>>>>
>>>>>>> Hi Leonardo,
>>>>>>> sadly the Sqoop output might not be that helpful in this case;
>>>>>>> could you please share with us the failed map task log?
>>>>>>>
>>>>>>> The easiest way to get it on Hadoop 1.x is to open the job tracker
>>>>>>> web interface, find the failed Sqoop job, and navigate to the failed
>>>>>>> map tasks.
>>>>>>>
>>>>>>> Jarcec
>>>>>>>
>>>>>>> On Tue, May 13, 2014 at 11:36:34AM -0300, Leonardo Brambilla wrote:
>>>>>>> > Hi Jarek, find below the full sqoop-generated log. I went through all
>>>>>>> > the cluster's nodes for this task's logs and there is nothing more
>>>>>>> > than this same error. I really don't know what else to look for.
>>>>>>> >
>>>>>>> > Thanks
>>>>>>> >
>>>>>>> >
>>>>>>> > Warning: /usr/lib/hbase does not exist! HBase imports will fail.
>>>>>>> > Please set $HBASE_HOME to the root of your HBase installation.
>>>>>>> > 14/05/13 10:26:41 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
>>>>>>> > 14/05/13 10:26:41 INFO manager.SqlManager: Using default fetchSize of 1000
>>>>>>> > 14/05/13 10:26:41 INFO manager.OracleManager: Time zone has been set to GMT
>>>>>>> > 14/05/13 10:26:41 INFO tool.CodeGenTool: Beginning code generation
>>>>>>> > 14/05/13 10:26:41 INFO manager.OracleManager: Time zone has been set to GMT
>>>>>>> > 14/05/13 10:26:41 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM etl.EXPT_SPAM_RED_JOB t WHERE 1=0
>>>>>>> > 14/05/13 10:26:41 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/elance/hadoop
>>>>>>> > Note: /tmp/sqoop-elance/compile/9f8f413ab105fbe67d985bdb29534d27/etl_EXPT_SPAM_RED_JOB.java uses or overrides a deprecated API.
>>>>>>> > Note: Recompile with -Xlint:deprecation for details.
>>>>>>> > 14/05/13 10:26:42 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-elance/compile/9f8f413ab105fbe67d985bdb29534d27/etl.EXPT_SPAM_RED_JOB.jar
>>>>>>> > 14/05/13 10:26:42 INFO mapreduce.ExportJobBase: Beginning export of etl.EXPT_SPAM_RED_JOB
>>>>>>> > 14/05/13 10:26:43 INFO input.FileInputFormat: Total input paths to process : 1
>>>>>>> > 14/05/13 10:26:43 INFO input.FileInputFormat: Total input paths to process : 1
>>>>>>> > 14/05/13 10:26:44 INFO mapred.JobClient: Running job: job_201404190827_0998
>>>>>>> > 14/05/13 10:26:45 INFO mapred.JobClient:  map 0% reduce 0%
>>>>>>> > 14/05/13 10:26:53 INFO mapred.JobClient:  map 25% reduce 0%
>>>>>>> > 14/05/13 10:26:54 INFO mapred.JobClient:  map 75% reduce 0%
>>>>>>> > 14/05/13 10:26:55 INFO mapred.JobClient: Task Id : attempt_201404190827_0998_m_000001_0, Status : FAILED
>>>>>>> > java.io.IOException: java.sql.SQLException: Missing IN or OUT parameter at index:: 4
>>>>>>> >         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:184)
>>>>>>> >         at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651)
>>>>>>> >         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
>>>>>>> >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>>>>>>> >         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>>>>>>> >         at java.security.AccessController.doPrivileged(Native Method)
>>>>>>> >         at javax.security.auth.Subject.doAs(Subject.java:415)
>>>>>>> >         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
>>>>>>> >         at org.apache.hadoop.mapred.Child.main(Child.java:249)
>>>>>>> > Caused by: java.sql.SQLException: Missing IN or OUT parameter at index:: 4
>>>>>>> >         at oracle.jdbc.driver.OraclePreparedStatement.processCompletedBindRow(OraclePreparedStatement.java:1844)
>>>>>>> >         at oracle.jdbc.driver.OraclePreparedStatement.addBatch(OraclePreparedStatement.java:10213)
>>>>>>> >         at oracle.jdbc.driver.OraclePreparedStatementWrapper.addBatch(OraclePreparedStatementWrapper.java:1362)
>>>>>>> >         at org.apache.sqoop.mapreduce.UpdateOutputFormat$UpdateRecordWriter.getPreparedStatement(UpdateOutputFormat.java:174)
>>>>>>> >         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.execUpdate(AsyncSqlRecordWriter.java:149)
>>>>>>> >         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:181)
>>>>>>> >         ... 8 more
>>>>>>> >
>>>>>>> > 14/05/13 10:27:00 INFO mapred.JobClient: Task Id : attempt_201404190827_0998_m_000001_1, Status : FAILED
>>>>>>> > java.io.IOException: java.sql.SQLException: Missing IN or OUT parameter at index:: 4
>>>>>>> >         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:184)
>>>>>>> >         at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651)
>>>>>>> >         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
>>>>>>> >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>>>>>>> >         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>>>>>>> >         at java.security.AccessController.doPrivileged(Native Method)
>>>>>>> >         at javax.security.auth.Subject.doAs(Subject.java:415)
>>>>>>> >         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
>>>>>>> >         at org.apache.hadoop.mapred.Child.main(Child.java:249)
>>>>>>> > Caused by: java.sql.SQLException: Missing IN or OUT parameter at index:: 4
>>>>>>> >         at oracle.jdbc.driver.OraclePreparedStatement.processCompletedBindRow(OraclePreparedStatement.java:1844)
>>>>>>> >         at oracle.jdbc.driver.OraclePreparedStatement.addBatch(OraclePreparedStatement.java:10213)
>>>>>>> >         at oracle.jdbc.driver.OraclePreparedStatementWrapper.addBatch(OraclePreparedStatementWrapper.java:1362)
>>>>>>> >         at org.apache.sqoop.mapreduce.UpdateOutputFormat$UpdateRecordWriter.getPreparedStatement(UpdateOutputFormat.java:174)
>>>>>>> >         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.execUpdate(AsyncSqlRecordWriter.java:149)
>>>>>>> >         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:181)
>>>>>>> >         ... 8 more
>>>>>>> >
>>>>>>> > 14/05/13 10:27:05 INFO mapred.JobClient: Task Id : attempt_201404190827_0998_m_000001_2, Status : FAILED
>>>>>>> > java.io.IOException: java.sql.SQLException: Missing IN or OUT parameter at index:: 4
>>>>>>> >         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:184)
>>>>>>> >         at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651)
>>>>>>> >         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
>>>>>>> >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>>>>>>> >         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>>>>>>> >         at java.security.AccessController.doPrivileged(Native Method)
>>>>>>> >         at javax.security.auth.Subject.doAs(Subject.java:415)
>>>>>>> >         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
>>>>>>> >         at org.apache.hadoop.mapred.Child.main(Child.java:249)
>>>>>>> > Caused by: java.sql.SQLException: Missing IN or OUT parameter at index:: 4
>>>>>>> >         at oracle.jdbc.driver.OraclePreparedStatement.processCompletedBindRow(OraclePreparedStatement.java:1844)
>>>>>>> >         at oracle.jdbc.driver.OraclePreparedStatement.addBatch(OraclePreparedStatement.java:10213)
>>>>>>> >         at oracle.jdbc.driver.OraclePreparedStatementWrapper.addBatch(OraclePreparedStatementWrapper.java:1362)
>>>>>>> >         at org.apache.sqoop.mapreduce.UpdateOutputFormat$UpdateRecordWriter.getPreparedStatement(UpdateOutputFormat.java:174)
>>>>>>> >         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.execUpdate(AsyncSqlRecordWriter.java:149)
>>>>>>> >         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:181)
>>>>>>> >         ... 8 more
>>>>>>> >
>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient: Job complete: job_201404190827_0998
>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient: Counters: 20
>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:   Job Counters
>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=30548
>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Rack-local map tasks=5
>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Launched map tasks=7
>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Data-local map tasks=2
>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Failed map tasks=1
>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:   File Output Format Counters
>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Bytes Written=0
>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:   FileSystemCounters
>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     HDFS_BYTES_READ=459
>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=189077
>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:   File Input Format Counters
>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Bytes Read=0
>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:   Map-Reduce Framework
>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Map input records=0
>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Physical memory (bytes) snapshot=363053056
>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Spilled Records=0
>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     CPU time spent (ms)=2110
>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Total committed heap usage (bytes)=553517056
>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=2344087552
>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     Map output records=0
>>>>>>> > 14/05/13 10:27:13 INFO mapred.JobClient:     SPLIT_RAW_BYTES=404
>>>>>>> > 14/05/13 10:27:13 INFO mapreduce.ExportJobBase: Transferred 459 bytes in 30.0642 seconds (15.2673 bytes/sec)
>>>>>>> > 14/05/13 10:27:13 INFO mapreduce.ExportJobBase: Exported 0 records.
>>>>>>> > 14/05/13 10:27:13 ERROR tool.ExportTool: Error during export: Export job failed!
>>>>>>> >
>>>>>>> >
>>>>>>> >
>>>>>>> > On Mon, May 12, 2014 at 10:44 PM, Jarek Jarcec Cecho
>>>>>>> > <jarcec@apache.org> wrote:
>>>>>>> >
>>>>>>> > > The map task log contains the entire executed query and a lot of
>>>>>>> > > additional information, and hence it's super useful in such cases.
>>>>>>> > >
>>>>>>> > > Jarcec
>>>>>>> > >
>>>>>>> > > On Mon, May 12, 2014 at 02:59:56PM -0300, Leonardo Brambilla wrote:
>>>>>>> > > > Hi Jarek,
>>>>>>> > > >
>>>>>>> > > > thanks for replying, I don't have the logs. I'll see if I can run
>>>>>>> > > > the task again and then keep the logs.
>>>>>>> > > >
>>>>>>> > > > Anyway, I don't remember seeing anything other than this
>>>>>>> > > > SQLException about the missing parameter.
>>>>>>> > > >
>>>>>>> > > > Leo
>>>>>>> > > >
>>>>>>> > > >
>>>>>>> > > > On Sun, May 11, 2014 at 10:59 AM, Jarek Jarcec Cecho
>>>>>>> > > > <jarcec@apache.org> wrote:
>>>>>>> > > >
>>>>>>> > > > > Hi Leonardo,
>>>>>>> > > > > would you mind sharing with us the task log from the failed
>>>>>>> > > > > map task?
>>>>>>> > > > >
>>>>>>> > > > > Jarcec
>>>>>>> > > > >
>>>>>>> > > > > On Sun, May 11, 2014 at 10:33:11AM -0300, Leonardo Brambilla wrote:
>>>>>>> > > > > > Hello, I am struggling to make this work, and it is a feature
>>>>>>> > > > > > I really need.
>>>>>>> > > > > >
>>>>>>> > > > > > I have a process that generates new data daily; this data
>>>>>>> > > > > > needs to be pushed to a table in Oracle, and the table might
>>>>>>> > > > > > already have some of the same data from previous loads. I
>>>>>>> > > > > > need to avoid duplicating data in it. Pretty common scenario,
>>>>>>> > > > > > right? =)
>>>>>>> > > > > >
>>>>>>> > > > > > I am using sqoop export for this, no special arguments, just
>>>>>>> > > > > > columns, fields-terminated-by, table, and db connection, plus
>>>>>>> > > > > > the argument "update-mode allowinsert".
>>>>>>> > > > > >
>>>>>>> > > > > > Now, when I also include the argument "update-key" with a
>>>>>>> > > > > > comma-separated list of fields (the same list as for the
>>>>>>> > > > > > columns arg), I get the following Oracle driver error:
>>>>>>> > > > > >
>>>>>>> > > > > > 14/05/07 16:00:03 INFO mapred.JobClient: Task Id : attempt_201404190827_0928_m_000003_2, Status : FAILED
>>>>>>> > > > > > java.io.IOException: Can't export data, please check task tracker logs
>>>>>>> > > > > >         at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
>>>>>>> > > > > >         at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
>>>>>>> > > > > >         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>>>>>>> > > > > >         at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
>>>>>>> > > > > >         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>>>>>>> > > > > >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>>>>>>> > > > > >         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>>>>>>> > > > > >         at java.security.AccessController.doPrivileged(Native Method)
>>>>>>> > > > > >         at javax.security.auth.Subject.doAs(Subject.java:415)
>>>>>>> > > > > >         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
>>>>>>> > > > > >         at org.apache.hadoop.mapred.Child.main(Child.java:249)
>>>>>>> > > > > > Caused by: java.io.IOException: java.sql.SQLException: Missing IN or OUT parameter at index:: 4
>>>>>>> > > > > >         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:220)
>>>>>>> > > > > >         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:46)
>>>>>>> > > > > >         at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:639)
>>>>>>> > > > > >         at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>>>>>>> > > > > >         at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:84)
>>>>>>> > > > > >         ... 10 more
>>>>>>> > > > > > *Caused by: java.sql.SQLException: Missing IN or OUT parameter at index:: 4*
>>>>>>> > > > > >         at oracle.jdbc.driver.OraclePreparedStatement.processCompletedBindRow(OraclePreparedStatement.java:1844)
>>>>>>> > > > > >         at oracle.jdbc.driver.OraclePreparedStatement.addBatch(OraclePreparedStatement.java:10213)
>>>>>>> > > > > >         at oracle.jdbc.driver.OraclePreparedStatementWrapper.addBatch(OraclePreparedStatementWrapper.java:1362)
>>>>>>> > > > > >         at org.apache.sqoop.mapreduce.UpdateOutputFormat$UpdateRecordWriter.getPreparedStatement(UpdateOutputFormat.java:174)
>>>>>>> > > > > >         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.execUpdate(AsyncSqlRecordWriter.java:149)
>>>>>>> > > > > >         at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:215)
>>>>>>> > > > > >         ... 14 more
>>>>>>> > > > > >
>>>>>>> > > > > > I'm using Sqoop 1.4.3 with hadoop1; I also tried 1.4.4 with
>>>>>>> > > > > > the same result. I have the standard Oracle JDBC driver 6
>>>>>>> > > > > > with Java 7.
>>>>>>> > > > > > I went through all the documentation; the Sqoop user guide
>>>>>>> > > > > > says this is supported by the built-in connector, which I
>>>>>>> > > > > > understand I am using.
>>>>>>> > > > > >
>>>>>>> > > > > > Here is the full command:
>>>>>>> > > > > > $sqoopExecutable export \
>>>>>>> > > > > > --outdir $outdir \
>>>>>>> > > > > > --connect $connectionString --table $table_client \
>>>>>>> > > > > > --username $dbUser --password $dbUserPasswd \
>>>>>>> > > > > > --columns CLIENT_ID,EXP_ID,BUCKET_ID \
>>>>>>> > > > > > --update-key CLIENT_ID,EXP_ID,BUCKET_ID \
>>>>>>> > > > > > --fields-terminated-by '\t' --update-mode allowinsert \
>>>>>>> > > > > > --export-dir $dataSource_client > $sqoopLog 2>&1
>>>>>>> > > > > >
>>>>>>> > > > > > Can someone please shed some light on this?
>>>>>>> > > > > > Thank you in advance.
>>>>>>> > > > > >
>>>>>>> > > > > > Leo
>>>>>>> > > > >
>>>>>>> > >
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
