sqoop-dev mailing list archives

From "ASF subversion and git services (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SQOOP-2745) Using datetime column as a splitter for Oracle no longer works
Date Thu, 17 Dec 2015 04:50:46 GMT

    [ https://issues.apache.org/jira/browse/SQOOP-2745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061467#comment-15061467 ]

ASF subversion and git services commented on SQOOP-2745:
--------------------------------------------------------

Commit 9c7638d74180cc607ef509b9dd9e6a45ff60c041 in sqoop's branch refs/heads/trunk from [~venkatnrangan]
[ https://git-wip-us.apache.org/repos/asf?p=sqoop.git;h=9c7638d ]

SQOOP-2745: Using datetime column as a splitter for Oracle no longer works
  (Jarek Jarcec Cecho via Venkat Ranganathan)


> Using datetime column as a splitter for Oracle no longer works
> --------------------------------------------------------------
>
>                 Key: SQOOP-2745
>                 URL: https://issues.apache.org/jira/browse/SQOOP-2745
>             Project: Sqoop
>          Issue Type: Bug
>    Affects Versions: 1.4.6
>            Reporter: Jarek Jarcec Cecho
>            Assignee: Jarek Jarcec Cecho
>             Fix For: 1.4.7
>
>         Attachments: SQOOP-2745.patch
>
>
> I was recently looking into a case where using the Oracle connector to import data split by a datetime column ({{Date}}, {{Time}} or {{Timestamp}}) does not work and fails with an error similar to the following:
> {code}
> 2015-12-15 23:03:41,902 INFO [main] org.apache.sqoop.mapreduce.db.DBInputFormat: Using read commited transaction isolation
> 2015-12-15 23:03:42,089 INFO [main] org.apache.hadoop.mapred.MapTask: Processing split: C3_TIMESTAMP >= '2015-12-12 19:21:50.0' AND C3_TIMESTAMP < '2029-08-20 08:21:58.0'
> 2015-12-15 23:03:42,238 INFO [main] org.apache.sqoop.mapreduce.db.OracleDBRecordReader: Time zone has been set to GMT
> 2015-12-15 23:03:42,274 INFO [main] org.apache.sqoop.mapreduce.db.DBRecordReader: Working on split: C3_TIMESTAMP >= '2015-12-12 19:21:50.0' AND C3_TIMESTAMP < '2029-08-20 08:21:58.0'
> 2015-12-15 23:03:42,343 INFO [main] org.apache.sqoop.mapreduce.db.DBRecordReader: Executing query: SELECT C1_INT, C2_DATE, C3_TIMESTAMP FROM V1_ORACLE_DATE_AND_TIMESTAMP WHERE ( C3_TIMESTAMP >= '2015-12-12 19:21:50.0' ) AND ( C3_TIMESTAMP < '2029-08-20 08:21:58.0' )
> 2015-12-15 23:03:42,394 ERROR [main] org.apache.sqoop.mapreduce.db.DBRecordReader: Top level exception:
> java.sql.SQLDataException: ORA-01843: not a valid month
> 	at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:445)
> 	at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:396)
> 	at oracle.jdbc.driver.T4C8Oall.processError(T4C8Oall.java:879)
> 	at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:450)
> 	at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:192)
> 	at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:531)
> 	at oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:207)
> 	at oracle.jdbc.driver.T4CPreparedStatement.executeForDescribe(T4CPreparedStatement.java:884)
> 	at oracle.jdbc.driver.OracleStatement.executeMaybeDescribe(OracleStatement.java:1167)
> 	at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1289)
> 	at oracle.jdbc.driver.OraclePreparedStatement.executeInternal(OraclePreparedStatement.java:3584)
> 	at oracle.jdbc.driver.OraclePreparedStatement.executeQuery(OraclePreparedStatement.java:3628)
> 	at oracle.jdbc.driver.OraclePreparedStatementWrapper.executeQuery(OraclePreparedStatementWrapper.java:1493)
> 	at org.apache.sqoop.mapreduce.db.DBRecordReader.executeQuery(DBRecordReader.java:111)
> 	at org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:235)
> 	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:556)
> 	at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
> 	at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
> 	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> 	at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> 	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
> 	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> 2015-12-15 23:03:42,421 INFO [Thread-12] org.apache.sqoop.mapreduce.AutoProgressMapper: Auto-progress thread is finished. keepGoing=false{code}
> I looked into the problem and found the root cause. The Oracle connector uses a custom {{OracleDataDrivenDBInputFormat}} that [overrides|https://github.com/apache/sqoop/blob/trunk/src/java/org/apache/sqoop/mapreduce/db/OracleDataDrivenDBInputFormat.java#L48] the {{getSplitter}} method from the parent {{DataDrivenDBInputFormat}} class. This custom splitter is essential because it ensures that we're correctly using datetime constants in generated queries. However, in SQOOP-2334 we changed the method {{getSplitter(int)}} to {{getSplitter(int, long)}} *without* changing the Oracle connector, which now overrides an unused method.
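
The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the actual Sqoop code: every class name here (Splitter, OracleDateSplitter, BaseInputFormat, StaleOracleInputFormat, FixedOracleInputFormat) and the splitterForJob helper are hypothetical stand-ins, and the bodies are simplified to return a label. It assumes only what the description states: the parent class now calls the two-argument getSplitter, so a subclass that still declares the one-argument form is an overload that nothing calls.

```java
import java.sql.Types;

// Hypothetical, simplified stand-ins; these are NOT the real Sqoop classes.
class Splitter {
    String describe() { return "generic"; }
}

class OracleDateSplitter extends Splitter {
    @Override
    String describe() { return "oracle-date"; }
}

class BaseInputFormat {
    // After the signature change, callers only use this two-argument form.
    Splitter getSplitter(int sqlType, long splitLimit) {
        return new Splitter();
    }

    // Hypothetical helper standing in for the code path that picks a splitter.
    Splitter splitterForJob(int sqlType) {
        return getSplitter(sqlType, -1L);
    }
}

class StaleOracleInputFormat extends BaseInputFormat {
    // Still the old one-argument signature: this is an overload, not an
    // override, so the parent never reaches it. Adding @Override here
    // would turn the silent bug into a compile error.
    Splitter getSplitter(int sqlType) {
        return new OracleDateSplitter();
    }
}

class FixedOracleInputFormat extends BaseInputFormat {
    // Matches the parent's current signature, so dynamic dispatch finds it.
    @Override
    Splitter getSplitter(int sqlType, long splitLimit) {
        return new OracleDateSplitter();
    }
}

public class SplitterOverrideDemo {
    public static void main(String[] args) {
        // The stale subclass silently falls back to the generic splitter.
        System.out.println(
            new StaleOracleInputFormat().splitterForJob(Types.TIMESTAMP).describe());
        // The fixed subclass supplies the Oracle-specific splitter.
        System.out.println(
            new FixedOracleInputFormat().splitterForJob(Types.TIMESTAMP).describe());
    }
}
```

Running the sketch prints "generic" for the stale subclass and "oracle-date" for the fixed one. In the reported job, the equivalent silent fallback is what lets plain string literals such as '2015-12-12 19:21:50.0' reach Oracle and trigger ORA-01843.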



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
