sqoop-dev mailing list archives

From "Veena Basavaraj (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (SQOOP-1803) JobManager and Execution Engine changes: Support for a injecting and pulling out configs and job output in connectors
Date Tue, 17 Mar 2015 15:16:39 GMT

    [ https://issues.apache.org/jira/browse/SQOOP-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14365299#comment-14365299 ]

Veena Basavaraj edited comment on SQOOP-1803 at 3/17/15 3:16 PM:
-----------------------------------------------------------------

Good, stepping back a bit helps! The task is how to pass the "configs" from the connector code
to the repository. Let's split it into the 2 things we are discussing.

1. In which in-memory object do we store the connector-created job data?
2. At what point is it safe to persist it to the repository?

#1. It might be a good idea for you to lay out the goals behind the "Context" object. As far as
I understand, its goal is to store the job data. Configs are also part of the job data, and
hence I believe it can be repurposed to store configs as well, unless there is a
"strong reason" that it cannot hold config data. I also proposed that we can easily add a
marker to this context to indicate "transient job data" vs. "persistent job data".
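
To make the marker idea concrete, here is a minimal sketch of a context that tags each
entry as transient or persistent. The names MarkedJobContext, Scope and persistentEntries
are hypothetical and chosen only for illustration; this is not the existing Sqoop Context
API, just one way the proposal could look.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch only: not Sqoop's actual Context class.
class MarkedJobContext {
    /** The proposed marker: does an entry need to outlive the job run? */
    enum Scope { TRANSIENT, PERSISTENT }

    private final Map<String, String> values = new LinkedHashMap<>();
    private final Map<String, Scope> scopes = new LinkedHashMap<>();

    /** Stores a value together with its marker. */
    void put(String key, String value, Scope scope) {
        values.put(key, value);
        scopes.put(key, scope);
    }

    /** Reads a value regardless of its marker. */
    String get(String key) {
        return values.get(key);
    }

    /** Returns only the entries marked PERSISTENT, in insertion order. */
    Map<String, String> persistentEntries() {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : values.entrySet()) {
            if (scopes.get(e.getKey()) == Scope.PERSISTENT) {
                result.put(e.getKey(), e.getValue());
            }
        }
        return Collections.unmodifiableMap(result);
    }
}
```

With something like this, connector code keeps writing to one object, and only the
entries returned by persistentEntries() would be candidates for the repository.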

#2. The output committer seems like the best place to store the persistent job data
into the repository. If there are other safe ways to do this, kindly lay
them out. We can discuss.
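
As a rough illustration of why the commit point is a safe place, here is a sketch in which
state is buffered during the run and written out only on commit. JobStateRepository,
InMemoryRepository and PersistingCommitter are hypothetical stand-ins invented for this
example; they are not the Sqoop repository API or the Hadoop OutputCommitter.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the Sqoop repository; not the real API.
interface JobStateRepository {
    void save(long jobId, Map<String, String> state);
}

// Hypothetical in-memory repository, used only to make the sketch runnable.
class InMemoryRepository implements JobStateRepository {
    final Map<Long, Map<String, String>> saved = new HashMap<>();

    @Override
    public void save(long jobId, Map<String, String> state) {
        saved.put(jobId, new HashMap<>(state));
    }
}

// Committer-style hook: state is buffered while the job runs and reaches
// the repository only when commitJob() is called.
class PersistingCommitter {
    private final long jobId;
    private final JobStateRepository repository;
    private final Map<String, String> pending = new HashMap<>();

    PersistingCommitter(long jobId, JobStateRepository repository) {
        this.jobId = jobId;
        this.repository = repository;
    }

    /** Buffers a piece of persistent job data produced during the run. */
    void record(String key, String value) {
        pending.put(key, value);
    }

    /** Called once the job has succeeded: persist the buffered state. */
    void commitJob() {
        repository.save(jobId, new HashMap<>(pending));
        pending.clear();
    }

    /** Called when the job fails: drop the buffered state, persist nothing. */
    void abortJob() {
        pending.clear();
    }
}
```

The point of the sketch is that a failed or aborted job never touches the repository, so
the stored state always corresponds to a run that actually completed.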




was (Author: vybs):
Good stepping back a bit helps.! The task is how to pass the "configs" from connector code
to repository. Lets split it into 2 things we are discussing.

1. Which in memory object do we store it in the connector code?
2. At what point is it safe to persist it  to the repository?

#1. Might be good idea for you to lay out the goals behind the "Context" object. As far as
I understand its goal is to store the job data. Configs are also part of the job data and
hence I believe it can be repurposed to store configs as well. Unless otherwise there is a
"strong reason" that it cannot hold configs data. I also proposed that we can easily add a
marker to this context to indicate "transient job data" Vs "persistet job data",

#2. The output committer seems like the best place where persistence of persistent job data
needs to be stored into the repository. If there are other safe ways to do this, kindly lay
them out. We can discuss.



> JobManager and Execution Engine changes: Support for a injecting and pulling out configs and job output in connectors
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: SQOOP-1803
>                 URL: https://issues.apache.org/jira/browse/SQOOP-1803
>             Project: Sqoop
>          Issue Type: Sub-task
>            Reporter: Veena Basavaraj
>            Assignee: Veena Basavaraj
>             Fix For: 1.99.6
>
>
> The details are in the design wiki; as the implementation happens, more discussions can
> happen here.
> https://cwiki.apache.org/confluence/display/SQOOP/Delta+Fetch+And+Merge+Design#DeltaFetchAndMergeDesign-Howtogetoutputfromconnectortosqoop?
> The goal is to dynamically inject an IncrementalConfig instance into the FromJobConfiguration.
> The current MFromConfig and MToConfig can already hold a list of configs, and a strong sentiment
> was expressed to keep it as a list, so why not actually make use of it for the first time and
> group the incremental-related configs in one config object?
> This task will prepare the FromJobConfiguration from the job config data and the ExtractorContext
> with the relevant values from the previous job run.
> This task will prepare the ToJobConfiguration from the job config data and the LoaderContext
> with the relevant values from the previous job run, if any.
> We will use DistributedCache to get State information out of the Extractor and Loader
> and finally persist it into the Sqoop repository (depending on SQOOP-1804) once the output
> committer's commit is called.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
