sqoop-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Nate Clevenger (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SQOOP-3003) Sqoop import fails to query with split-by/boundary-query using Oracle Date/Timestamp
Date Mon, 29 Aug 2016 14:00:27 GMT
Nate Clevenger created SQOOP-3003:
-------------------------------------

             Summary: Sqoop import fails to query with split-by/boundary-query using Oracle
Date/Timestamp
                 Key: SQOOP-3003
                 URL: https://issues.apache.org/jira/browse/SQOOP-3003
             Project: Sqoop
          Issue Type: Bug
          Components: connectors/oracle
            Reporter: Nate Clevenger


Given the following example sqoop import command intended to import data from an Oracle test_table,
split-by a timestamp_column using a boundary query (e.g. one-day range) with sqoop parallelism
of eight:
{code}
sqoop import --connect jdbc:oracle:... --username <username> --password <pwd>
--target-dir /tmp/sqoop/test -m 8 --null-string '' --append --query "SELECT primary_key, TO_CHAR(timestamp_column)
FROM test_table  WHERE  primary_key != 0 AND \$CONDITIONS" --split-by "timestamp_column" --boundary-query
"SELECT TO_TIMESTAMP('1970-01-01', 'yyyy-mm-dd')+numtodsinterval(1472083200,'second'), TO_TIMESTAMP('1970-01-01',
'yyyy-mm-dd')+numtodsinterval(1472169600,'second') FROM DUAL"
{code}

The following exception is thrown by each map tasks:
{code}
Caused by: java.sql.SQLDataException: ORA-01843: not a valid month

	at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:447)
	at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:396)
	at oracle.jdbc.driver.T4C8Oall.processError(T4C8Oall.java:951)
	at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:513)
	at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:227)
	at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:531)
	at oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:208)
	at oracle.jdbc.driver.T4CPreparedStatement.executeForDescribe(T4CPreparedStatement.java:886)
	at oracle.jdbc.driver.OracleStatement.executeMaybeDescribe(OracleStatement.java:1175)
	at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1296)
	at oracle.jdbc.driver.OraclePreparedStatement.executeInternal(OraclePreparedStatement.java:3613)
	at oracle.jdbc.driver.OraclePreparedStatement.executeQuery(OraclePreparedStatement.java:3657)
	at oracle.jdbc.driver.OraclePreparedStatementWrapper.executeQuery(OraclePreparedStatementWrapper.java:1495)
	at org.apache.sqoop.mapreduce.db.DBRecordReader.executeQuery(DBRecordReader.java:111)
	at org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:235)
	... 12 more
{code}

Inspecting the source code, the issue appears to be attributed to OracleManager failing to
set the correct input format (should be [OracleDataDrivenDBInputFormat|https://github.com/apache/sqoop/blob/release-1.4.6-rc3/src/java/org/apache/sqoop/mapreduce/db/OracleDataDrivenDBInputFormat.java#L47],
but appears to be getting set to [DataDrivenDBInputFormat|https://github.com/apache/sqoop/blob/release-1.4.6-rc3/src/java/org/apache/sqoop/mapreduce/db/DataDrivenDBInputFormat.java#L79],
resulting in [DateSplitter|https://github.com/apache/sqoop/blob/release-1.4.6-rc3/src/java/org/apache/sqoop/mapreduce/db/DateSplitter.java#L180]
being applied instead of [OracleDateSplitter|https://github.com/apache/sqoop/blob/release-1.4.6-rc3/src/java/org/apache/sqoop/mapreduce/db/OracleDateSplitter.java#L30]).
 OracleManager appears to apply the correct input format when using the [\--table option|https://github.com/apache/sqoop/blob/release-1.4.6-rc3/src/java/org/apache/sqoop/manager/OracleManager.java#L437]
in {{sqoop import}}, but doesn't apply a similar override when using the [\--query option|https://github.com/apache/sqoop/blob/release-1.4.6-rc3/src/java/org/apache/sqoop/manager/SqlManager.java#L676],
resulting in the input format being [defaulted to DataDriveDBInputFormat|https://github.com/apache/sqoop/blob/a0b730c77e297a62909063289ef37a2b993ff5e1/src/java/org/apache/sqoop/manager/ImportJobContext.java#L42].

This defect was tested using 1.4.6-cdh5.5.2-release and 1.4.6-cdh5.6.0-release, and the affected
code issue appears to still be applicable as of the latest in trunk.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message