sqoop-dev mailing list archives

From "ASF subversion and git services (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SQOOP-3123) Import from oracle using oraoop with map-column-java to avro fails if special characters encounter in table name or column name
Date Mon, 20 Mar 2017 16:33:41 GMT

    [ https://issues.apache.org/jira/browse/SQOOP-3123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15932989#comment-15932989 ]

ASF subversion and git services commented on SQOOP-3123:
--------------------------------------------------------

Commit e280b47eacc3428040669df5f91cedccd5be7e46 in sqoop's branch refs/heads/trunk from [~maugli]
[ https://git-wip-us.apache.org/repos/asf?p=sqoop.git;h=e280b47 ]

SQOOP-3123: Introduce escaping logic for column mapping parameters (the
same escaping Sqoop already uses for the DB column names), so that special
column names (e.g. containing the '#' character) and the mappings related
to those columns can be written in the same format (not confusing the end
users), which also eliminates the related Avro format clashing issues.

(Dmitry Zagorulkin via Attila Szabo)
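A minimal sketch of the idea behind the fix (class and method names here are hypothetical, not Sqoop's actual API): apply the same identifier cleanup to the keys of --map-column-java that Sqoop already applies to DB column names, so a mapping key written as C_USE_START#DATE lines up with the generated column C_USE_START_DATE.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not Sqoop's actual classes: cleanse mapping keys
// with the same rule used for DB column names, so keys containing '#'
// or '@' line up with the generated Java/Avro field names.
public class MappingCleanup {

    // Replace every character that is not a valid Java identifier part
    // with an underscore (mirrors the behavior seen in the bug report).
    static String cleanse(String name) {
        StringBuilder sb = new StringBuilder();
        for (char c : name.toCharArray()) {
            sb.append(Character.isJavaIdentifierPart(c) ? c : '_');
        }
        return sb.toString();
    }

    // Parse "COL=Type,COL2=Type2" and cleanse each key on the way in.
    static Map<String, String> parseMappings(String arg) {
        Map<String, String> mappings = new HashMap<>();
        for (String pair : arg.split(",")) {
            String[] kv = pair.split("=", 2);
            mappings.put(cleanse(kv[0]), kv[1]);
        }
        return mappings;
    }

    public static void main(String[] args) {
        Map<String, String> m =
            parseMappings("C_USE_START#DATE=String,C_L@M=String");
        System.out.println(m.get("C_USE_START_DATE")); // String
        System.out.println(m.get("C_L_M"));            // String
    }
}
```

With this kind of cleanup on both sides, the mapping and the column name stay in one consistent format, which is what the commit message describes.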


> Import from oracle using oraoop with map-column-java to avro fails if special characters encounter in table name or column name
> --------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SQOOP-3123
>                 URL: https://issues.apache.org/jira/browse/SQOOP-3123
>             Project: Sqoop
>          Issue Type: Bug
>          Components: codegen
>    Affects Versions: 1.4.6, 1.4.7
>            Reporter: Dmitry Zagorulkin
>             Fix For: 1.4.7
>
>         Attachments: SQOOP_3123.patch
>
>
> I'm trying to import data from Oracle to Avro using OraOop.
> My table:
> {code}
> CREATE TABLE "IBS"."BRITISH#CATS"
> (    "ID" NUMBER,
>      "C_CODE" VARCHAR2(10),
>      "C_USE_START#DATE" DATE,
>      "C_USE_USE#NEXT_DAY" VARCHAR2(1),
>      "C_LIM_MIN#DAT" DATE,
>      "C_LIM_MIN#TIME" TIMESTAMP,
>      "C_LIM_MIN#SUM" NUMBER,
>      "C_OWNCODE" VARCHAR2(1),
>      "C_LIMIT#SUM_LIMIT" NUMBER(17,2),
>      "C_L@M" NUMBER(17,2),
>      "C_1_THROW" NUMBER NOT NULL ENABLE,
>      "C_#_LIMITS" NUMBER NOT NULL ENABLE
> ) SEGMENT CREATION IMMEDIATE
> PCTFREE 70 PCTUSED 40 INITRANS 2 MAXTRANS 255
> NOCOMPRESS LOGGING
> STORAGE(INITIAL 2097152 NEXT 524288 MINEXTENTS 1 MAXEXTENTS 2147483645
> PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
> BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
> TABLESPACE "WORK" ;
> {code}
> My first script is:
> {code}
> ./sqoop import \
>   -Doraoop.timestamp.string=false \
>   --direct \
>   --connect jdbc:oracle:thin:@localhost:49161:XE \
>   --username system \
>   --password oracle \
>   --table IBS.BRITISH#CATS \
>   --target-dir /Users/Dmitry/Developer/Java/sqoop/bin/imported \
>   --as-avrodatafile \
>   --map-column-java ID=String,C_CODE=String,C_USE_START#DATE=String,C_USE_USE#NEXT_DAY=String,C_LIM_MIN#DAT=String,C_LIM_MIN#TIME=String,C_LIM_MIN#SUM=String,C_OWNCODE=String,C_LIMIT#SUM_LIMIT=String,C_L_M=String,C_1_THROW=String,C_#_LIMITS=String
> {code}
> fails with
> {code}
> 2017-01-13 16:11:21,348 ERROR [main] tool.ImportTool (ImportTool.java:run(625)) - Import failed: No column by the name C_LIMIT#SUM_LIMITfound while importing data; expecting one of [C_LIMIT_SUM_LIMIT, C_OWNCODE, C_L_M, C___LIMITS, C_LIM_MIN_DAT, C_1_THROW, C_CODE, C_USE_START_DATE, C_LIM_MIN_SUM, ID, C_LIM_MIN_TIME, C_USE_USE_NEXT_DAY]
> {code}
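The underscored names in the "expecting one of" list above come from Sqoop's codegen identifier cleanup. Roughly (a sketch of the observed behavior, not Sqoop's actual code), every character that is not a valid Java identifier character is replaced with an underscore:

```java
// Sketch of the name translation behind the error above (not Sqoop's
// actual code): invalid Java identifier characters become underscores.
public class ColumnNames {
    static String toJavaIdentifier(String col) {
        StringBuilder sb = new StringBuilder();
        for (char c : col.toCharArray()) {
            sb.append(Character.isJavaIdentifierPart(c) ? c : '_');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toJavaIdentifier("C_LIMIT#SUM_LIMIT")); // C_LIMIT_SUM_LIMIT
        System.out.println(toJavaIdentifier("C_#_LIMITS"));        // C___LIMITS
        System.out.println(toJavaIdentifier("C_L@M"));             // C_L_M
    }
}
```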
> After that I found that Sqoop had replaced all special characters with underscores. My second script is:
> {code}
> ./sqoop import \
>   -D oraoop.timestamp.string=false \
>   --direct \
>   --connect jdbc:oracle:thin:@localhost:49161:XE \
>   --username system \
>   --password oracle \
>   --table IBS.BRITISH#CATS \
>   --target-dir /Users/Dmitry/Developer/Java/sqoop/bin/imported \
>   --as-avrodatafile \
>   --map-column-java ID=String,C_CODE=String,C_USE_START_DATE=String,C_USE_USE_NEXT_DAY=String,C_LIM_MIN_DAT=String,C_LIM_MIN_TIME=String,C_LIM_MIN_SUM=String,C_OWNCODE=String,C_LIMIT_SUM_LIMIT=String,C_L_M=String,C_1_THROW=String,C___LIMITS=String \
>   --verbose
> {code}
> Fails with: org.apache.avro.UnresolvedUnionException: Not in union ["null","long"]: 2017-01-13 11:22:53.0
> {code}
> 2017-01-13 16:14:54,687 WARN  [Thread-26] mapred.LocalJobRunner (LocalJobRunner.java:run(560)) - job_local1372531461_0001
> java.lang.Exception: org.apache.avro.file.DataFileWriter$AppendWriteException: org.apache.avro.UnresolvedUnionException: Not in union ["null","long"]: 2017-01-13 11:22:53.0
> 	at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
> 	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
> Caused by: org.apache.avro.file.DataFileWriter$AppendWriteException: org.apache.avro.UnresolvedUnionException: Not in union ["null","long"]: 2017-01-13 11:22:53.0
> 	at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:308)
> 	at org.apache.sqoop.mapreduce.AvroOutputFormat$1.write(AvroOutputFormat.java:112)
> 	at org.apache.sqoop.mapreduce.AvroOutputFormat$1.write(AvroOutputFormat.java:108)
> 	at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:655)
> 	at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
> 	at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
> 	at org.apache.sqoop.mapreduce.AvroImportMapper.map(AvroImportMapper.java:73)
> 	at org.apache.sqoop.mapreduce.AvroImportMapper.map(AvroImportMapper.java:39)
> 	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
> 	at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> 	at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.avro.UnresolvedUnionException: Not in union ["null","long"]: 2017-01-13 11:22:53.0
> 	at org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:709)
> 	at org.apache.avro.generic.GenericDatumWriter.resolveUnion(GenericDatumWriter.java:192)
> 	at org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:110)
> 	at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73)
> 	at org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:150)
> 	at org.apache.avro.generic.GenericDatumWriter.writeField(GenericDatumWriter.java:153)
> 	at org.apache.avro.specific.SpecificDatumWriter.writeField(SpecificDatumWriter.java:90)
> 	at org.apache.avro.reflect.ReflectDatumWriter.writeField(ReflectDatumWriter.java:182)
> 	at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:143)
> 	at org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:105)
> 	at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73)
> 	at org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:150)
> 	at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:60)
> 	at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:302)
> 	... 17 more
> {code}
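The second failure is easier to see with a toy model of Avro union resolution (an illustrative sketch, not Avro's real GenericData#resolveUnion): the writer tries each branch of the ["null","long"] union in turn, and the string "2017-01-13 11:22:53.0" produced for the TIMESTAMP column matches neither branch, so the append fails.

```java
import java.util.Arrays;
import java.util.List;

// Toy model of Avro's union resolution (illustrative only): find the
// first branch the datum fits; if none fits, fail the same way
// UnresolvedUnionException does in the stack trace above.
public class UnionSketch {

    // Branch classes for the union ["null","long"]; a null entry stands
    // for the Avro "null" branch.
    static final List<Class<?>> NULL_OR_LONG = Arrays.asList(null, Long.class);

    static int resolveUnion(List<Class<?>> branches, Object datum) {
        for (int i = 0; i < branches.size(); i++) {
            Class<?> branch = branches.get(i);
            boolean matches = (branch == null) ? datum == null
                                               : branch.isInstance(datum);
            if (matches) {
                return i;
            }
        }
        throw new IllegalStateException(
            "Not in union [\"null\",\"long\"]: " + datum);
    }

    public static void main(String[] args) {
        System.out.println(resolveUnion(NULL_OR_LONG, 1484306573000L)); // 1
        // A TIMESTAMP rendered as a String matches neither branch:
        try {
            resolveUnion(NULL_OR_LONG, "2017-01-13 11:22:53.0");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

This is why the escaping fix matters: once the mapping C_LIM_MIN#TIME=String is actually applied, the column is generated as a string field instead of being written into a long-based union.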
> I've found that this is an old problem and *oraoop.timestamp.string=false* should solve it, but it does not.
> What do you think?
> Also, please assign this issue to me.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
