sqoop-dev mailing list archives

From "Veena Basavaraj (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SQOOP-1803) JobManager and Execution Engine changes: Support for a injecting and pulling out configs and job output in connectors
Date Fri, 13 Mar 2015 16:59:40 GMT

    [ https://issues.apache.org/jira/browse/SQOOP-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14360643#comment-14360643 ]

Veena Basavaraj commented on SQOOP-1803:
----------------------------------------

[~jarcec] Here are some more details on option #4.

If we restrict the "state" to being set in the Initializer (to begin with, rather than using
the persistent store), the current mutable context API should already support it.


Related question: the "MutableContext" API may need to be extended to support adding any value,
not just a string. The current code always stores the object as a string; I am not sure whether
this is a limitation.
{code}
  public MutableMapContext() {
    this(new HashMap<String, String>());
  }
{code}
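
To illustrate why string-only storage need not block typed values, here is a minimal sketch of a string-backed context with typed accessors layered on top. StringBackedContext and the key name "incremental.last.value" are hypothetical names used only for illustration; this is not Sqoop's actual MutableMapContext implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a string-backed mutable context (not Sqoop's class).
final class StringBackedContext {
  // Every value is kept as a String, mirroring the HashMap<String, String> above.
  private final Map<String, String> options = new HashMap<>();

  void setString(String key, String value) {
    options.put(key, value);
  }

  String getString(String key) {
    return options.get(key);
  }

  // Typed setter: the long is serialized to its decimal string form.
  void setLong(String key, long value) {
    options.put(key, Long.toString(value));
  }

  // Typed getter: parses the stored string, or returns the default if the key is absent.
  long getLong(String key, long defaultValue) {
    String raw = options.get(key);
    return raw == null ? defaultValue : Long.parseLong(raw);
  }

  public static void main(String[] args) {
    StringBackedContext context = new StringBackedContext();
    // "incremental.last.value" is an assumed key name for this sketch.
    context.setLong("incremental.last.value", 1500L);
    System.out.println(context.getLong("incremental.last.value", -1L));
  }
}
```

Under this assumption, primitives and other values with a stable string form round-trip without an API change; arbitrary objects would still need an explicit serialization step.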

Let's take a concrete use case, GenericJdbcFromInitializer, where the "last" value is put into
this context once it is determined.
I do not see a need for any API change, since the following can be done. This means we
predetermine the last value before the actual mappers execute.

{code}
/**
   * Set long value for given key.
   *
   * @param key Key
   * @param value New value
   */
  public void setLong(String key, long value);

{code}

So in the initializer code we do...
{code}
context.setLong(...)
{code}

The job manager code already has access to this context object.

{code}

    // Initialize submission from the connector perspective
    initializer.initialize(initializerContext, jobRequest.getConnectorLinkConfig(direction),
        jobRequest.getJobConfig(direction));

{code}

So this means that if the job succeeds, we call the repository API to persist this value in the repo.

{code}
      RepositoryManager.getInstance().getRepository().updateJobConfig( ...)
{code}
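
Putting the steps above together, the flow could be sketched roughly as follows. All types here (Context, InMemoryRepository) and the key name are hypothetical stand-ins written for this illustration, not Sqoop classes; the real code would go through the initializer context and the repository API mentioned above.

```java
import java.util.HashMap;
import java.util.Map;

final class IncrementalFlowSketch {

  // Assumed key name, used only in this sketch.
  static final String LAST_VALUE_KEY = "incremental.last.value";

  // Hypothetical minimal context that keeps values as strings.
  static final class Context {
    private final Map<String, String> values = new HashMap<>();

    void setLong(String key, long value) {
      values.put(key, Long.toString(value));
    }

    boolean contains(String key) {
      return values.containsKey(key);
    }

    long getLong(String key, long defaultValue) {
      String raw = values.get(key);
      return raw == null ? defaultValue : Long.parseLong(raw);
    }
  }

  // Hypothetical in-memory stand-in for the repository.
  static final class InMemoryRepository {
    final Map<Long, Long> lastValueByJobId = new HashMap<>();

    void saveLastValue(long jobId, long lastValue) {
      lastValueByJobId.put(jobId, lastValue);
    }
  }

  // Initializer step: determine the "last" value before any mapper runs.
  static void initialize(Context context, long[] checkColumnValues) {
    if (checkColumnValues.length == 0) {
      return; // nothing to record
    }
    long max = checkColumnValues[0];
    for (long value : checkColumnValues) {
      max = Math.max(max, value);
    }
    context.setLong(LAST_VALUE_KEY, max);
  }

  // Job manager step: persist the value only when the job succeeded.
  static void onJobFinished(long jobId, boolean succeeded, Context context,
      InMemoryRepository repository) {
    if (succeeded && context.contains(LAST_VALUE_KEY)) {
      repository.saveLastValue(jobId, context.getLong(LAST_VALUE_KEY, -1L));
    }
  }

  public static void main(String[] args) {
    Context context = new Context();
    InMemoryRepository repository = new InMemoryRepository();
    initialize(context, new long[] {3L, 17L, 9L});
    onJobFinished(1L, true, context, repository);
    System.out.println(repository.lastValueByJobId);
  }
}
```

The point of the sketch is the ordering: the value is fixed during initialization, and a failed job leaves the previously persisted state untouched.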

Let me know if this makes sense or needs more details.

> JobManager and Execution Engine changes: Support for a injecting and pulling out configs
and job output in connectors 
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: SQOOP-1803
>                 URL: https://issues.apache.org/jira/browse/SQOOP-1803
>             Project: Sqoop
>          Issue Type: Sub-task
>            Reporter: Veena Basavaraj
>            Assignee: Veena Basavaraj
>             Fix For: 1.99.6
>
>
> The details are in the design wiki, as the implementation happens more discussions can
> happen here.
> https://cwiki.apache.org/confluence/display/SQOOP/Delta+Fetch+And+Merge+Design#DeltaFetchAndMergeDesign-Howtogetoutputfromconnectortosqoop?
> The goal is to dynamically inject a IncrementalConfig instance into the FromJobConfiguration.
> The current MFromConfig and MToConfig can already hold a list of configs, and a strong sentiment
> was expressed to keep it as a list, why not for the first time actually make use of it and
> group the incremental related configs in one config object
> This task will prepare the FromJobConfiguration from the job config data, ExtractorContext
> with the relevant values from the prev job run
> This task will prepare the ToJobConfiguration from the job config data, LoaderContext
> with the relevant values from the prev job run if any
> We will use DistributedCache to get State information from the Extractor and Loader out
> and finally persist it into the sqoop repository depending on SQOOP-1804 once the outputcommitter
> commit is called



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
