sqoop-dev mailing list archives

From "Ruslan Dautkhanov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SQOOP-3030) Export to Oracle using (--direct + --hcatalog + --columns) with columns having the wrong case sensitivity fails with NPE
Date Wed, 19 Oct 2016 03:16:58 GMT

    [ https://issues.apache.org/jira/browse/SQOOP-3030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15587519#comment-15587519 ]

Ruslan Dautkhanov commented on SQOOP-3030:
------------------------------------------

Hi [~markuskemper@me.com],

{quote}
unfortunately --export-dir is not supported due to a KiteSDK limitation.
{quote}

KiteSDK developers say it should be fixed on the Sqoop side :)
See https://issues.apache.org/jira/browse/SQOOP-2907?focusedCommentId=15580833&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15580833

cc [~blue@cloudera.com]


Thank you,
Ruslan

> Export to Oracle using (--direct + --hcatalog + --columns) with columns having the wrong case sensitivity fails with NPE
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SQOOP-3030
>                 URL: https://issues.apache.org/jira/browse/SQOOP-3030
>             Project: Sqoop
>          Issue Type: Bug
>          Components: connectors/oracle, hive-integration
>            Reporter: Markus Kemper
>
> It appears that if the Sqoop --columns option does not match the case of the columns in the Oracle RDBMS, then with (export + --hcatalog + --direct) the Sqoop map task fails with an NPE, whereas the non-direct path fails correctly.  My test case is below; please review and let me know if you have any questions.
> Additionally, it would be nice if this column-case mismatch could be detected at job-submission time rather than submitting the job only to have it fail in YARN.
> TEST CASE:
> STEP 01 : Create Oracle Source/Target Table
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "drop table t1"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "create table t1 (c1 int, c2 varchar(10))"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "insert into t1 values (1, 'some data')"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "select * from t1"
> -------------------------------------
> | C1                   | C2         | 
> -------------------------------------
> | 1                    | some data  | 
> -------------------------------------
> STEP 02 : Import Oracle Table using Sqoop --hcatalog options
> beeline -u jdbc:hive2:// -e "use default; drop table t1_text;"
> sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --table T1 --num-mappers 1 --hcatalog-database default --hcatalog-table t1_text --create-hcatalog-table --hcatalog-storage-stanza 'stored as textfile'
> beeline -u jdbc:hive2:// -e "use default; select * from t1_text;"
> Output:
> <SNIP>
> 16/10/18 09:11:31 INFO hcat.SqoopHCatUtilities: Database column names projected : [c1,
c2]
> 16/10/18 09:11:31 INFO hcat.SqoopHCatUtilities: Database column name - info map :
> 	c1 : [Type : 2,Precision : 38,Scale : 0]
> 	c2 : [Type : 12,Precision : 10,Scale : 0]
> 16/10/18 09:11:31 INFO hcat.SqoopHCatUtilities: Creating HCatalog table default.t1_text
for import
> 16/10/18 09:11:31 INFO hcat.SqoopHCatUtilities: HCatalog Create table statement: 
> create table `default`.`t1_text` (
> 	`c1` decimal(38),
> 	`c2` varchar(10))
> stored as textfile
> <SNIP>
> 16/10/18 09:12:14 INFO mapreduce.ImportJobBase: Transferred 12 bytes in 34.4222 seconds
(0.3486 bytes/sec)
> 16/10/18 09:12:14 INFO mapreduce.ImportJobBase: Retrieved 1 records.
> ---
> +-------------+-------------+--+
> | t1_text.c1  | t1_text.c2  |
> +-------------+-------------+--+
> | 1           | some data   |
> +-------------+-------------+--+
> STEP 03 : Attempt Export to Oracle using Sqoop --hcatalog and --columns options
> sqoop export --connect $MYCONN --username $MYUSER --password $MYPSWD --table T2 --num-mappers 2 --hcatalog-database default --hcatalog-table t1_text --columns "c1,c2"
> Output: (failure, as expected)
> <SNIP>
> 16/10/18 09:13:48 INFO hcat.SqoopHCatUtilities: Database column names projected : [c1,
c2]
> 16/10/18 09:13:48 INFO hcat.SqoopHCatUtilities: Database column name - info map :
> 	c1 : [Type : 2,Precision : 38,Scale : 0]
> 	c2 : [Type : 12,Precision : 10,Scale : 0]
> <SNIP>
> 16/10/18 09:13:48 INFO hive.metastore: Connected to metastore.
> 16/10/18 09:13:48 INFO hcat.SqoopHCatUtilities: HCatalog full table schema fields = [c1,
c2]
> 16/10/18 09:13:49 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead,
use mapreduce.output.fileoutputformat.outputdir
> 16/10/18 09:13:49 INFO hcat.SqoopHCatUtilities: HCatalog table partitioning key fields
= []
> 16/10/18 09:13:49 INFO hcat.SqoopHCatUtilities: HCatalog projected schema fields = [c1,
c2]
> <SNIP>
> 16/10/18 09:14:22 INFO mapreduce.ExportJobBase: Exported 0 records.
> 16/10/18 09:14:22 ERROR tool.ExportTool: Error during export: Export job failed!
> ---
> 2016-10-18 09:14:19,418 INFO [main] org.apache.hive.hcatalog.mapreduce.InternalUtil:
Initializing org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe with properties {transient_lastDdlTime=1476807097,
name=default.t1_text, serialization.null.format=\N, columns=c1,c2, serialization.lib=org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe,
serialization.format=1, columns.types=decimal(38,0),varchar(10)}
> 2016-10-18 09:14:19,660 INFO [Thread-12] org.apache.sqoop.mapreduce.AutoProgressMapper:
Auto-progress thread is finished. keepGoing=false
> 2016-10-18 09:14:19,952 ERROR [Thread-11] org.apache.sqoop.mapreduce.AsyncSqlOutputFormat:
Got exception in update thread: java.sql.SQLSyntaxErrorException: ORA-00904: "c2": invalid
identifier
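The ORA-00904 above is expected Oracle behavior rather than a Sqoop bug: unquoted identifiers fold to upper case, but a double-quoted identifier is matched case-sensitively, so the lower-case names taken verbatim from --columns do not resolve against columns actually stored as C1/C2. A rough sketch of how such a statement comes out (illustrative Python, not Sqoop's code generator; names are hypothetical):

```python
def build_insert(table, columns):
    """Build an Oracle-style INSERT with double-quoted column identifiers.

    Double-quoted identifiers are case-sensitive in Oracle, so quoting the
    lower-case --columns values produces "c1"/"c2", which do not match
    columns stored as C1/C2 -> ORA-00904: invalid identifier.
    """
    quoted = ", ".join('"%s"' % c for c in columns)
    binds = ", ".join(":%s" % c for c in columns)
    return "INSERT INTO %s (%s) VALUES (%s)" % (table, quoted, binds)
```

With the lower-case --columns values this yields INSERT INTO T2 ("c1", "c2") VALUES (:c1, :c2), a statement Oracle rejects outright, which is why the non-direct path fails "correctly" with a SQL error instead of an NPE.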
> STEP 03b : Attempt Export to Oracle using Sqoop --hcatalog and --columns options and Oracle --direct
> sqoop export --connect $MYCONN --username $MYUSER --password $MYPSWD --table T2 --num-mappers 2 --hcatalog-database default --hcatalog-table t1_text --columns "c1,c2" --direct
> Output: (unexpected failure: NPE)
> <SNIP>
> 16/10/18 09:17:47 INFO oracle.OraOopManagerFactory: 
> **************************************************
> *** Using Data Connector for Oracle and Hadoop ***
> **************************************************
> 16/10/18 09:17:47 INFO oracle.OraOopManagerFactory: Oracle Database version: Oracle Database
11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
> 16/10/18 09:17:47 INFO oracle.OraOopManagerFactory: This Oracle database is not a RAC.
> 16/10/18 09:17:47 INFO Configuration.deprecation: mapred.map.max.attempts is deprecated.
Instead, use mapreduce.map.maxattempts
> 16/10/18 09:17:47 INFO tool.CodeGenTool: Beginning code generation
> 16/10/18 09:17:47 INFO manager.SqlManager: Executing SQL statement: SELECT "C1","C2"
FROM T2 WHERE 0=1
> <SNIP>
> 16/10/18 09:17:50 INFO mapreduce.ExportJobBase: Configuring HCatalog for export job
> 16/10/18 09:17:50 INFO hcat.SqoopHCatUtilities: Configuring HCatalog specific details
for job
> 16/10/18 09:17:50 INFO manager.SqlManager: Executing SQL statement: SELECT "C1","C2"
FROM "T2" WHERE 1=0
> 16/10/18 09:17:50 INFO hcat.SqoopHCatUtilities: Database column names projected : [c1,
c2]
> 16/10/18 09:17:50 INFO hcat.SqoopHCatUtilities: Database column name - info map :
> 	c1 : [Type : 2,Precision : 38,Scale : 0]
> 	c2 : [Type : 12,Precision : 10,Scale : 0]
> <SNIP>
> 16/10/18 09:17:51 INFO hive.metastore: Connected to metastore.
> 16/10/18 09:17:52 INFO hcat.SqoopHCatUtilities: HCatalog full table schema fields = [c1,
c2]
> <SNIP>
> 16/10/18 09:18:25 INFO mapreduce.ExportJobBase: Exported 0 records.
> 16/10/18 09:18:25 ERROR tool.ExportTool: Error during export: Export job failed!
> ---
> 2016-10-18 09:18:23,561 INFO [main] org.apache.sqoop.manager.oracle.OraOopOutputFormatBase:
The number of rows per batch is: 100
> 2016-10-18 09:18:23,561 INFO [main] org.apache.sqoop.manager.oracle.OraOopOutputFormatBase:
The number of batches per commit is: 100
> 2016-10-18 09:18:23,721 INFO [main] org.apache.sqoop.manager.oracle.OraOopOutputFormatBase:
This record writer is connected to Oracle via the JDBC URL: 
> 	"oracle.jdbc.driver.T4CConnection@34133979"
> 	to the Oracle instance: "ORCL"
> 2016-10-18 09:18:23,828 INFO [main] org.apache.sqoop.manager.oracle.OraOopOracleQueries:
Session Time Zone set to GMT
> 2016-10-18 09:18:23,883 INFO [main] org.apache.sqoop.manager.oracle.OracleConnectionFactory:
Initializing Oracle session with SQL :
> begin 
>   dbms_application_info.set_module(module_name => 'Data Connector for Oracle and Hadoop',
action_name => 'export 20161018091747PDT'); 
> end;
> 2016-10-18 09:18:23,883 WARN [main] org.apache.sqoop.manager.oracle.OracleConnectionFactory:
No Oracle 'session initialization' statements were found to execute.
> Check that your oraoop-site-template.xml and/or oraoop-site.xml files are correctly installed
in the ${SQOOP_HOME}/conf directory.
> 2016-10-18 09:18:24,116 INFO [main] org.apache.hive.hcatalog.mapreduce.InternalUtil:
Initializing org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe with properties {transient_lastDdlTime=1476807097,
name=default.t1_text, serialization.null.format=\N, columns=c1,c2, serialization.lib=org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe,
serialization.format=1, columns.types=decimal(38,0),varchar(10)}
> 2016-10-18 09:18:24,345 INFO [Thread-12] org.apache.sqoop.mapreduce.AutoProgressMapper:
Auto-progress thread is finished. keepGoing=false
> 2016-10-18 09:18:24,357 INFO [main] org.apache.sqoop.manager.oracle.OraOopOutputFormatBase:
Batch-Mode insert statement:
> insert  into "SQOOP"."T2"
> ("C1"
> ,"C2")
> values
> (:C1
> ,:C2)
> 2016-10-18 09:18:24,358 ERROR [main] org.apache.sqoop.manager.oracle.OraOopOutputFormatInsert:
The following error occurred during configurePreparedStatement()
> java.lang.NullPointerException
> 	at org.apache.sqoop.manager.oracle.OraOopOutputFormatBase$OraOopDBRecordWriterBase.setBindValueAtName(OraOopOutputFormatBase.java:432)
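The direct path appears to die one step later than the non-direct one: the batch INSERT is built from the table's upper-case metadata (:C1, :C2), but the bind values are then looked up under the user-supplied lower-case names, the lookup comes back null, and the null is dereferenced inside setBindValueAtName. The failure mode can be mimicked in miniature (hypothetical names; the real code lives in OraOopOutputFormatBase.java):

```python
def set_bind_values(statement_binds, record):
    """Analogy for binding record values by name into a prepared statement.

    statement_binds: bind names taken from Oracle metadata, e.g. ["C1", "C2"]
    record: values keyed by the --columns spelling, e.g. {"c1": 1, "c2": "x"}
    """
    values = []
    for name in statement_binds:
        # Case-sensitive lookup: "C1" is not a key of {"c1": ..., "c2": ...}.
        value = record.get(name)
        # In the Java connector the equivalent null is passed on unchecked,
        # which is where the NullPointerException surfaces.
        values.append(value)
    return values
```

With mismatched case every lookup returns None, so nothing meaningful ever reaches the statement; a case-insensitive resolution (or an upfront validation error) would avoid the NPE entirely.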
> STEP 04 : Verify Export without --direct and correct column case works
> sqoop export --connect $MYCONN --username $MYUSER --password $MYPSWD --table T1 --num-mappers 2 --hcatalog-database default --hcatalog-table t1_text --columns "C1,C2"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "select * from t1"
> Output:
> 16/10/18 09:26:46 INFO mapreduce.ExportJobBase: Transferred 7.79 KB in 30.9303 seconds
(257.9025 bytes/sec)
> 16/10/18 09:26:46 INFO mapreduce.ExportJobBase: Exported 1 records.
> ---
> -------------------------------------
> | C1                   | C2         | 
> -------------------------------------
> | 1                    | some data  | 
> | 1                    | some data  | 
> -------------------------------------
> STEP 05 : Verify Export with --direct and correct column case works
> sqoop export --connect $MYCONN --username $MYUSER --password $MYPSWD --table T1 --num-mappers 2 --hcatalog-database default --hcatalog-table t1_text --columns "C1,C2" --direct
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "select * from t1"
> Output:
> <SNIP>
> 16/10/18 09:30:40 INFO oracle.OraOopManagerFactory: 
> **************************************************
> *** Using Data Connector for Oracle and Hadoop ***
> **************************************************
> <SNIP>
> 16/10/18 09:31:16 INFO mapreduce.ExportJobBase: Transferred 7.79 KB in 30.9987 seconds
(257.333 bytes/sec)
> 16/10/18 09:31:16 INFO mapreduce.ExportJobBase: Exported 1 records.
> ---
> -------------------------------------
> | C1                   | C2         | 
> -------------------------------------
> | 1                    | some data  | 
> | 1                    | some data  | 
> | 1                    | some data  | 
> -------------------------------------



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
