samza-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ke Wu <ke.wu...@gmail.com>
Subject Re: [DISCUSS] SEP-23: Simplify Job Runner
Date Tue, 03 Dec 2019 00:42:42 GMT
Hi Xinyu,

Please see the response in line:

>   1. After this change, seems the original config-factory and config-path
>   are only used to supply parameters for submitting job. Is that the case?
>   Which configs are still needed in the submission?

Yes, only configs related to job submission is needed.

job.name, job.factory.class & yarn.package.path are the minimum three configs needed for
the job submission, which may be supplied by --config instead. 

>   2. For backward compatibility, does it still work if the user doesn't
>   specify the new ConfigLoader in the command line? The
>   PropertiesConfigLoader class seems requiring the path of the config after
>   exploding the tgz.

If the user does not specify config loader in the config, then it will work in the previous
flow, where runner publishes configs in coordinator stream and job coordinator/application
master will pick it up by reading from Kafka. So this is a backward compatible change.

>   3. If the final plan is to remove the original config factory/path, how
>   do we pass the parameters needed for Yarn submission, e.g. job name, id,
>   and tgz path?

We can either pass them by --config or introduce delicate command line arguments for it in
CommandLine.scala.


Let me know if you have any further questions.

Best,
Ke

> On Nov 27, 2019, at 11:02 AM, Xinyu Liu <xinyuliu.us@gmail.com> wrote:
> 
> Thanks a lot for putting out the design for simplifying the job submission
> process. The motivation makes sense to me that most of the planning and
> config generation should be done after submitting to the cluster, instead
> of during the submission, which can happen in a local sandbox without the
> access to the resources needed for planning. It also improves the process
> from the security stand of the view.
> 
> A few questions regarding to the interface changes:
> 
>   1. After this change, seems the original config-factory and config-path
>   are only used to supply parameters for submitting job. Is that the case?
>   Which configs are still needed in the submission?
>   2. For backward compatibility, does it still work if the user doesn't
>   specify the new ConfigLoader in the command line? The
>   PropertiesConfigLoader class seems requiring the path of the config after
>   exploding the tgz.
>   3. If the final plan is to remove the original config factory/path, how
>   do we pass the parameters needed for Yarn submission, e.g. job name, id,
>   and tgz path?
> 
> Thanks,
> Xinyu
> 
> On Fri, Nov 15, 2019 at 3:00 PM Ke Wu <ke.wu.cs@gmail.com> wrote:
> 
>> We created SEP-23: Simplify Job Runner, which simplifies job runner by
>> moving config retrieval and planning to AM.
>> 
>> Please find out the SEP wiki below:
>> 
>> https://cwiki.apache.org/confluence/display/SAMZA/SEP-23%3A+Simplify+Job+Runner
>> 
>> Please take a look and chime in your thoughts.
>> 
>> Thanks,
>> Ke
>> 


Mime
View raw message