sqoop-dev mailing list archives

From "Veena Basavaraj (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SQOOP-1803) JobManager and Execution Engine changes: Support for a injecting and pulling out configs and job output in connectors
Date Thu, 26 Mar 2015 16:40:52 GMT

     [ https://issues.apache.org/jira/browse/SQOOP-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Veena Basavaraj updated SQOOP-1803:
-----------------------------------
    Attachment: SQOOP-1803-POC-2.patch

Since we decided not to use the "distributed cache" for storing data, the idea of committing the
config info to a file and then reading it back once the job has finished successfully is no
longer an option.

This patch does a few things:
1. Introduces a config data object to be stored in the context. It is an object so that we can
add more attributes to it in the future. It stores the data as an object together with its
corresponding type, so we don't store a map or list or any other possible input type.
2. We currently use a convention to name the configs, but we could just as well ignore the key
and have a name field in the config data that the user has to fill in when persisting.
3. There may be cases where not every config data entry stored in this "map" needs to be
persisted, but if there is no use case for that, I am happy to remove the "isPersistent" boolean.
4. Since the job config update APIs exist, this patch makes use of them rather than updating
the entire job.
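
To make points 1 to 3 concrete, here is a minimal sketch of what such a typed config data
entry and its holding "map" could look like. This is not the code from the attached patch:
the class names ConfigData and ConfigDataContext, the method names, and the key names are
all hypothetical and chosen only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical typed entry: a value, its declared type, and a persistence flag. */
final class ConfigData {
    private final Class<?> type;
    private final Object value;
    private final boolean persistent;

    ConfigData(Class<?> type, Object value, boolean persistent) {
        this.type = Objects.requireNonNull(type, "type");
        // Reject values that do not match the declared type.
        if (value != null && !type.isInstance(value)) {
            throw new IllegalArgumentException(
                "Value is not an instance of " + type.getName());
        }
        this.value = value;
        this.persistent = persistent;
    }

    Class<?> getType() { return type; }

    boolean isPersistent() { return persistent; }

    /** Returns the value cast to the expected type; throws ClassCastException on mismatch. */
    <T> T getValue(Class<T> expected) { return expected.cast(value); }
}

/** Hypothetical holder for named entries; stands in for the "map" kept in the context. */
final class ConfigDataContext {
    private final Map<String, ConfigData> entries = new LinkedHashMap<>();

    void put(String name, ConfigData data) {
        entries.put(Objects.requireNonNull(name, "name"),
                    Objects.requireNonNull(data, "data"));
    }

    /** Returns the entry stored under the name, or null if there is none. */
    ConfigData get(String name) { return entries.get(name); }

    /** Returns only the entries flagged as persistent, in insertion order. */
    Map<String, ConfigData> persistentEntries() {
        Map<String, ConfigData> result = new LinkedHashMap<>();
        for (Map.Entry<String, ConfigData> e : entries.entrySet()) {
            if (e.getValue().isPersistent()) {
                result.put(e.getKey(), e.getValue());
            }
        }
        return result;
    }
}
```

In this sketch, only the entries returned by persistentEntries() would be handed to the job
config update APIs once the job finishes. If the "isPersistent" flag is dropped as suggested
in point 3, that filter simply goes away and every entry is written back.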

Feedback welcome

> JobManager and Execution Engine changes: Support for a injecting and pulling out configs
> and job output in connectors
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: SQOOP-1803
>                 URL: https://issues.apache.org/jira/browse/SQOOP-1803
>             Project: Sqoop
>          Issue Type: Sub-task
>            Reporter: Veena Basavaraj
>            Assignee: Veena Basavaraj
>             Fix For: 2.0.0
>
>         Attachments: SQOOP-1803-POC-2.patch, SQOOP-1803-POC.patch
>
>
> The details are in the design wiki, as the implementation happens more discussions can
> happen here.
> https://cwiki.apache.org/confluence/display/SQOOP/Delta+Fetch+And+Merge+Design#DeltaFetchAndMergeDesign-Howtogetoutputfromconnectortosqoop?
> The goal is to dynamically inject a IncrementalConfig instance into the FromJobConfiguration.
> The current MFromConfig and MToConfig can already hold a list of configs, and a strong sentiment
> was expressed to keep it as a list, why not for the first time actually make use of it and
> group the incremental related configs in one config object
> This task will prepare the FromJobConfiguration from the job config data, ExtractorContext
> with the relevant values from the prev job run
> This task will prepare the ToJobConfiguration from the job config data, LoaderContext
> with the relevant values from the prev job run if any
> We will use DistributedCache to get State information from the Extractor and Loader out
> and finally persist it into the sqoop repository depending on SQOOP-1804 once the outputcommitter
> commit is called



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
