sqoop-dev mailing list archives

From "Jason Phelps (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SQOOP-3245) Documentation for timezone handling with Oracle and Parquet may be confusing
Date Wed, 25 Oct 2017 19:44:00 GMT
Jason Phelps created SQOOP-3245:
-----------------------------------

             Summary: Documentation for timezone handling with Oracle and Parquet may be confusing
                 Key: SQOOP-3245
                 URL: https://issues.apache.org/jira/browse/SQOOP-3245
             Project: Sqoop
          Issue Type: Improvement
            Reporter: Jason Phelps
            Priority: Minor


The current documentation does not mention that --as-parquetfile converts all date/timestamp
types to Long format (milliseconds since epoch), and that the values are also converted to
match the session timezone.

This can cause inconsistencies when the data was inserted in a timezone that differs from
the timezone of the host running the sqoop command.
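
To make the inconsistency concrete, here is a minimal sketch (not Sqoop's actual code) of how one
zone-less timestamp maps to different epoch-millisecond values depending on the timezone used to
interpret it. The helper name wall_clock_to_epoch_millis, the sample value, and the fixed UTC-7
offset are assumptions chosen for illustration only.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def wall_clock_to_epoch_millis(wall_clock: str, utc_offset_hours: int) -> int:
    """Interpret a zone-less timestamp in a fixed UTC offset; return epoch millis.

    Hypothetical helper for illustration; it is not part of Sqoop.
    """
    naive = datetime.strptime(wall_clock, "%Y-%m-%d %H:%M:%S")
    aware = naive.replace(tzinfo=timezone(timedelta(hours=utc_offset_hours)))
    # Integer division by a 1 ms timedelta avoids floating-point rounding.
    return (aware - EPOCH) // timedelta(milliseconds=1)


# The same stored value, read under two different session timezones.
stored = "2017-10-25 12:00:00"
as_gmt = wall_clock_to_epoch_millis(stored, 0)       # session timezone GMT
as_minus7 = wall_clock_to_epoch_millis(stored, -7)   # assumed UTC-7 session

print(as_gmt, as_minus7)
print((as_minus7 - as_gmt) // 3_600_000, "hours apart")
```

The two Long values differ by the offset between the two timezones, so a reader of the Parquet
file cannot recover the original wall-clock value without knowing which session timezone was in
effect during the import.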

In addition, the current Oracle documentation says the following:


{code:java}
Oracle also includes the additional date/time types TIMESTAMP WITH TIMEZONE and TIMESTAMP
WITH LOCAL TIMEZONE. To support these types, the user’s session timezone must be specified.
By default, Sqoop will specify the timezone "GMT" to Oracle. You can override this setting
by specifying a Hadoop property oracle.sessionTimeZone on the command-line when running a
Sqoop job. For example:
{code}

What is not mentioned is that this is only applicable with OraOop (--direct) enabled. The
passage can also be read as saying that only 'TIMESTAMP WITH TIMEZONE' and 'TIMESTAMP WITH
LOCAL TIMEZONE' columns are affected, when in fact the entire session uses the GMT timezone.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
