sqoop-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Attila Szabo (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SQOOP-3288) Incremental import's upper bound ignores session time zone in Oracle
Date Wed, 21 Feb 2018 16:19:00 GMT

    [ https://issues.apache.org/jira/browse/SQOOP-3288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16371609#comment-16371609
] 

Attila Szabo commented on SQOOP-3288:
-------------------------------------

[~dvoros],

Ohh yeah! You're right! And applying the proper timezone was fixed by me in -SQOOP-3071-...
My old memory...

Approved. Let me sign it on the review ticket as well!

Cheers,

[~maugli]

ps:
Thanks for the b part of the explanation, but that was clear already. But thanks!

> Incremental import's upper bound ignores session time zone in Oracle
> --------------------------------------------------------------------
>
>                 Key: SQOOP-3288
>                 URL: https://issues.apache.org/jira/browse/SQOOP-3288
>             Project: Sqoop
>          Issue Type: Bug
>          Components: connectors/oracle
>    Affects Versions: 1.4.7
>            Reporter: Daniel Voros
>            Assignee: Daniel Voros
>            Priority: Major
>             Fix For: 1.5.0
>
>         Attachments: SQOOP-3288.1.patch
>
>
> At the moment we're using [{{SELECT SYSDATE FROM dual}}|https://github.com/apache/sqoop/blob/3153c3610da7e5db388bfb14f3681d308e9e89c6/src/java/org/apache/sqoop/manager/OracleManager.java#L652]
when getting current time from Oracle.
> SYSDATE returns the underlying operating system's current time, while CURRENT_TIMESTAMP
uses the session time zone. This could lead to problems during incremental imports *when Oracle's
time zone is different from the OS*.
> Consider the following scenario when Oracle is configured to {{+0:00}}, while the OS
is {{+5:00}}:
> ||Oracle time||OS time||Event||
> |2:00|7:00|{{sqoop import --last-value 1:00 ...}} => imports {{[1:00, 7:00)}}|
> |2:30|7:30|{{update ... set last_updated = current_timestamp ...}} => set to {{2:30}}
*Won't be imported!*|
> |3:00|8:00|{{sqoop import --last-value 7:00 ...}} => imports {{[7:00, 8:00)}}|
> This way records updated within 5 hours after the last sqoop import won't get imported.
> Please note, that the example above assumes, that the user/administrator who's updating
the Oracle table will use the current session time of Oracle when setting the "last updated"
column of the table.
> I think the solution is to use CURRENT_TIMESTAMP instead of SYSDATE. Other connection
managers, like MySQL or PostgreSQL are using that as well.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Mime
View raw message