sqoop-dev mailing list archives

From "Veena Basavaraj (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SQOOP-1168) Sqoop2: Incremental From ( formerly called Incremental Import )
Date Mon, 24 Nov 2014 19:04:13 GMT

    [ https://issues.apache.org/jira/browse/SQOOP-1168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14223311#comment-14223311
] 

Veena Basavaraj commented on SQOOP-1168:
----------------------------------------

[~vinothchandar] I am back on this ticket full time now!

I would prefer that the implementation stay true to the semantics of what "last_modified" means.
Should it not?
But it depends on how we implement this as a whole.

One way would be to design this so that each individual connector handles the part
of writing the delta records to its data source. So if the HDFS connector ended up producing
a new file and reconciling it later, it would be up to that connector to document what its
strategy is. That is much simpler for Sqoop.
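
To make the idea concrete, here is a rough sketch of what "the connector owns the delta-writing strategy" could look like. Everything in it is hypothetical: DeltaWriter, AppendNewFileWriter and UpsertWriter are illustrative names, not part of the Sqoop2 connector API, and the in-memory maps only stand in for real data sources.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical interface (an assumption, not the Sqoop2 connector API):
// each connector decides how delta records reach its data source.
interface DeltaWriter {
    void writeDelta(Map<String, String> deltaByKey);
    String describeStrategy();
}

// HDFS-like strategy: every run appends a new "file"; reconciliation is a later step.
class AppendNewFileWriter implements DeltaWriter {
    final List<Map<String, String>> files = new ArrayList<>();

    public void writeDelta(Map<String, String> deltaByKey) {
        files.add(new LinkedHashMap<>(deltaByKey));
    }

    public String describeStrategy() {
        return "append a new file per run, reconcile later";
    }

    // Merge files from oldest to newest so that later values win.
    Map<String, String> reconcile() {
        Map<String, String> merged = new LinkedHashMap<>();
        for (Map<String, String> file : files) {
            merged.putAll(file);
        }
        return merged;
    }
}

// Database-like strategy: apply the delta in place, keyed by record id.
class UpsertWriter implements DeltaWriter {
    final Map<String, String> table = new LinkedHashMap<>();

    public void writeDelta(Map<String, String> deltaByKey) {
        table.putAll(deltaByKey);
    }

    public String describeStrategy() {
        return "upsert by key";
    }
}

class DeltaWriterDemo {
    public static void main(String[] args) {
        Map<String, String> run1 = new LinkedHashMap<>();
        run1.put("1", "a");
        run1.put("2", "b");
        Map<String, String> run2 = new LinkedHashMap<>();
        run2.put("2", "b-updated");

        AppendNewFileWriter hdfsLike = new AppendNewFileWriter();
        UpsertWriter dbLike = new UpsertWriter();
        for (DeltaWriter writer : new DeltaWriter[] {hdfsLike, dbLike}) {
            writer.writeDelta(run1);
            writer.writeDelta(run2);
            System.out.println(writer.describeStrategy());
        }
        // Both strategies should converge on the same final state.
        System.out.println(hdfsLike.reconcile().equals(dbLike.table));
    }
}
```

In this sketch Sqoop itself only hands the delta to the writer; whether the result is a new file that gets reconciled later or an in-place update is entirely the connector's documented choice.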


PS: AFAIK, there is a bug in the Sqoop 1 implementation of last_modified, and I am not completely
sure that it really works end to end.


Yes, I believe the FROM/TO state should be per job. The Job holds FromConfig and
ToConfig objects that hold some of the state information. This can be extended to keep track
of the per-run info related to incremental reading and writing. Note that anything prefixed with
M is persisted:

https://github.com/apache/sqoop/blob/sqoop2/common/src/main/java/org/apache/sqoop/model/MJob.java#L39

MFromConfig extends MConfigList. Each of the configs in that list already supports inputs within
it. So I am debating whether we can just plug a new config object into this list, for both From
and To, that stores the incremental info, and then persist it. That way there is not much of
a repository schema change I need to make.

{code}
public class MFromConfig extends MConfigList {
  // ...
}
{code}
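
A minimal sketch of the "plug a new config into the list" idea follows. The classes are simplified, hypothetical stand-ins (SketchConfig, SketchConfigList, SketchFromConfig), not the real MConfig/MConfigList/MFromConfig, and the config name "incremental" and the input names "checkColumn" and "lastValue" are assumptions made only for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a config holding named inputs (not the real Sqoop2 model class).
class SketchConfig {
    private final String name;
    private final Map<String, String> inputs = new LinkedHashMap<>();

    SketchConfig(String name) { this.name = name; }

    String getName() { return name; }
    void setInput(String key, String value) { inputs.put(key, value); }
    String getInput(String key) { return inputs.get(key); }
}

// Hypothetical stand-in for a list of configs that is persisted as a whole.
class SketchConfigList {
    private final List<SketchConfig> configs = new ArrayList<>();

    void add(SketchConfig config) { configs.add(config); }

    SketchConfig getConfig(String name) {
        for (SketchConfig config : configs) {
            if (config.getName().equals(name)) {
                return config;
            }
        }
        return null;
    }

    int size() { return configs.size(); }
}

// The FROM side of a job: incremental state is just one more config in the list.
class SketchFromConfig extends SketchConfigList {
    static final String INCREMENTAL = "incremental"; // assumed config name

    // Record where the last run stopped; the config is created on first use.
    void recordLastValue(String checkColumn, String lastValue) {
        SketchConfig incremental = getConfig(INCREMENTAL);
        if (incremental == null) {
            incremental = new SketchConfig(INCREMENTAL);
            add(incremental);
        }
        incremental.setInput("checkColumn", checkColumn);
        incremental.setInput("lastValue", lastValue);
    }

    // Returns null when no incremental run has been recorded yet.
    String lastValue() {
        SketchConfig incremental = getConfig(INCREMENTAL);
        return incremental == null ? null : incremental.getInput("lastValue");
    }
}

class IncrementalConfigDemo {
    public static void main(String[] args) {
        SketchFromConfig from = new SketchFromConfig();
        from.add(new SketchConfig("fromJobConfig"));
        from.recordLastValue("last_modified", "2014-11-24 19:04:13");
        from.recordLastValue("last_modified", "2014-11-25 08:00:00");
        System.out.println(from.size() + " configs, last value = " + from.lastValue());
    }
}
```

Because the incremental state travels as an ordinary config with ordinary inputs, whatever already persists the list would persist it too, which is why the repository schema would need little change under this approach.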




> Sqoop2: Incremental From ( formerly called Incremental Import )
> ---------------------------------------------------------------
>
>                 Key: SQOOP-1168
>                 URL: https://issues.apache.org/jira/browse/SQOOP-1168
>             Project: Sqoop
>          Issue Type: Bug
>            Reporter: Hari Shreedharan
>            Assignee: Veena Basavaraj
>             Fix For: 1.99.5
>
>
> Initial plan is to follow roughly the same design as Sqoop 1, except provide pluggability
> to start this through a REST API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
