sqoop-dev mailing list archives

From "Joseph Mesterhazy (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SQOOP-3252) Wrong AVRO schema on import of DATE column with oraoop Connector
Date Tue, 18 Dec 2018 19:50:00 GMT

    [ https://issues.apache.org/jira/browse/SQOOP-3252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16724421#comment-16724421 ]

Joseph Mesterhazy commented on SQOOP-3252:

We are having the same problem. We're OK with the AVRO type being long, but there is no way
to know that this long represents a timestamp. Sqoop should include the logical type
annotation in the Avro schema, e.g. timestamp-micros.
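
As an illustration of the annotation requested above, here is a minimal sketch of what an annotated field could look like, built as plain JSON in Python. The record name TEST_DATE and column name DATE_COL are borrowed from the quoted issue below; the helper `timestamp_field` is hypothetical, and whether `timestamp-micros` or `timestamp-millis` is the right unit depends on the value Sqoop actually writes. This is not Sqoop's generated output.

```python
import json


def timestamp_field(name, unit="micros"):
    """Build a nullable Avro field whose long carries a timestamp logical type.

    Hypothetical helper for illustration only; `unit` is "micros" or "millis".
    """
    return {
        "name": name,
        # Union of null and an annotated long, so readers can tell
        # the number is a timestamp rather than an arbitrary integer.
        "type": ["null", {"type": "long", "logicalType": f"timestamp-{unit}"}],
        "default": None,
    }


# Assumed record and column names, taken from the issue description.
schema = {
    "type": "record",
    "name": "TEST_DATE",
    "fields": [timestamp_field("DATE_COL")],
}

print(json.dumps(schema, indent=2))
```

A reader that understands Avro logical types can then map the long to a timestamp; one that does not will still see a plain long.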


> Wrong AVRO schema on import of DATE column with oraoop Connector
> ----------------------------------------------------------------
>                 Key: SQOOP-3252
>                 URL: https://issues.apache.org/jira/browse/SQOOP-3252
>             Project: Sqoop
>          Issue Type: Bug
>          Components: codegen, connectors/oracle
>    Affects Versions: 1.4.6
>         Environment: Hortonworks HDP
>            Reporter: Johannes Mayer
>            Priority: Critical
>   Original Estimate: 24h
>  Remaining Estimate: 24h
> I have found a bug when importing a table from an Oracle database using the Oracle connector
> for Hadoop.
> The sqoop command looks like this: 
> {code:java}
> sqoop import -Dmapreduce.job.user.classpath.first=true -Doraoop.timestamp.string=true \
>   --connect jdbc:oracle:thin:@oracle_server --username USER --password xxx --as-avrodatafile \
>   --direct --table TEST_DATE --target-dir /user/admin/test_date
> {code}
> The generated AVRO schema for the date field is [null, long]. This is why an error is
> thrown when sqoop tries to write the date string to that field. The workaround for this problem
> is to add a mapping for this column:
> {code:java}
> --map-column-java DATE_COL=String
> {code}
> But since I want to build a general solution, adding a mapping for each DATE column is
> not an option.
> The type of the generated AVRO schema for DATE columns should be [null, string] when
> using the Oracle connector for Hadoop and when oraoop.timestamp.string=true.
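
The mismatch described in the issue can be sketched in a few lines of Python. This is a simplified model of Avro union matching for three primitive types only, not the Avro library's own validation; `matches_union` and the sample date string are hypothetical.

```python
def matches_union(value, union):
    """Return True if `value` fits at least one branch of an Avro-style union.

    Simplified model covering only the "null", "long" and "string" branches.
    """
    checks = {
        "null": lambda v: v is None,
        # bool is a subclass of int in Python, so exclude it explicitly.
        "long": lambda v: isinstance(v, int) and not isinstance(v, bool),
        "string": lambda v: isinstance(v, str),
    }
    return any(checks[branch](value) for branch in union)


# With oraoop.timestamp.string=true the DATE value arrives as a string
# (the exact format shown here is an assumption).
date_value = "2018-12-18 19:50:00"

generated = ["null", "long"]    # schema reported in the issue
expected = ["null", "string"]   # schema the reporter asks for

print(matches_union(date_value, generated))
print(matches_union(date_value, expected))
```

The string fits no branch of [null, long], which corresponds to the write error in the report; it fits [null, string], which is what the --map-column-java workaround produces for a single column.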
