samza-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jagadish Venkatraman <jagadish1...@gmail.com>
Subject Re: [DISCUSS] SEP-23: Simplify Job Runner
Date Wed, 11 Dec 2019 03:26:58 GMT
+1 Lgtm, thanks Ke for the proposal.

On Tuesday, December 10, 2019, Prateek Maheshwari <prateekmi2@gmail.com>
wrote:

> Looks good to me as well. +1 for the overall proposal, and thanks for
> putting it together.
>
> - Prateek
>
> On Tue, Dec 10, 2019 at 1:26 PM Xinyu Liu <xinyuliu.us@gmail.com> wrote:
>
> > Thanks for updating the SEP wiki. The revised design looks clear to me.
> > Some config name might be simplified, e.g.
> > job.config.loader.properties.path.
> > Overall it looks good.
> >
> > Thanks,
> > Xinyu
> >
> > On Fri, Dec 6, 2019 at 1:52 PM Ke Wu <ke.wu.cs@gmail.com> wrote:
> >
> > > As we are revamping our config loading logic in SEP-23: Simplify Job
> > > Runner, we will introduce backward incompatible changes on the job
> > > submission logic, i.e. deprecating the usage of --config-factory and
> > > --config-path, which reads a full config upon job submission, instead,
> we
> > > will only provide job submission related config through --config and
> > > ConfigLoader will load the complete config on AM instead.
> > >
> > > Please take a look and chime in your thoughts.
> > >
> > > Best,
> > > Ke
> > >
> > > > On Dec 2, 2019, at 4:42 PM, Ke Wu <ke.wu.cs@gmail.com> wrote:
> > > >
> > > > Hi Xinyu,
> > > >
> > > > Please see the response in line:
> > > >
> > > >>  1. After this change, seems the original config-factory and
> > config-path
> > > >>  are only used to supply parameters for submitting job. Is that the
> > > case?
> > > >>  Which configs are still needed in the submission?
> > > >
> > > > Yes, only configs related to job submission is needed.
> > > >
> > > > job.name, job.factory.class & yarn.package.path are the minimum
> three
> > > configs needed for the job submission, which may be supplied by
> --config
> > > instead.
> > > >
> > > >>  2. For backward compatibility, does it still work if the user
> doesn't
> > > >>  specify the new ConfigLoader in the command line? The
> > > >>  PropertiesConfigLoader class seems requiring the path of the config
> > > after
> > > >>  exploding the tgz.
> > > >
> > > > If the user does not specify config loader in the config, then it
> will
> > > work in the previous flow, where runner publishes configs in
> coordinator
> > > stream and job coordinator/application master will pick it up by
> reading
> > > from Kafka. So this is a backward compatible change.
> > > >
> > > >>  3. If the final plan is to remove the original config factory/path,
> > how
> > > >>  do we pass the parameters needed for Yarn submission, e.g. job
> name,
> > > id,
> > > >>  and tgz path?
> > > >
> > > > We can either pass them by --config or introduce delicate command
> line
> > > arguments for it in CommandLine.scala.
> > > >
> > > >
> > > > Let me know if you have any further questions.
> > > >
> > > > Best,
> > > > Ke
> > > >
> > > >> On Nov 27, 2019, at 11:02 AM, Xinyu Liu <xinyuliu.us@gmail.com>
> > wrote:
> > > >>
> > > >> Thanks a lot for putting out the design for simplifying the job
> > > submission
> > > >> process. The motivation makes sense to me that most of the planning
> > and
> > > >> config generation should be done after submitting to the cluster,
> > > instead
> > > >> of during the submission, which can happen in a local sandbox
> without
> > > the
> > > >> access to the resources needed for planning. It also improves the
> > > process
> > > >> from the security stand of the view.
> > > >>
> > > >> A few questions regarding to the interface changes:
> > > >>
> > > >>  1. After this change, seems the original config-factory and
> > config-path
> > > >>  are only used to supply parameters for submitting job. Is that the
> > > case?
> > > >>  Which configs are still needed in the submission?
> > > >>  2. For backward compatibility, does it still work if the user
> doesn't
> > > >>  specify the new ConfigLoader in the command line? The
> > > >>  PropertiesConfigLoader class seems requiring the path of the config
> > > after
> > > >>  exploding the tgz.
> > > >>  3. If the final plan is to remove the original config factory/path,
> > how
> > > >>  do we pass the parameters needed for Yarn submission, e.g. job
> name,
> > > id,
> > > >>  and tgz path?
> > > >>
> > > >> Thanks,
> > > >> Xinyu
> > > >>
> > > >> On Fri, Nov 15, 2019 at 3:00 PM Ke Wu <ke.wu.cs@gmail.com> wrote:
> > > >>
> > > >>> We created SEP-23: Simplify Job Runner, which simplifies job runner
> > by
> > > >>> moving config retrieval and planning to AM.
> > > >>>
> > > >>> Please find out the SEP wiki below:
> > > >>>
> > > >>>
> > >
> > https://cwiki.apache.org/confluence/display/SAMZA/SEP-
> 23%3A+Simplify+Job+Runner
> > > >>>
> > > >>> Please take a look and chime in your thoughts.
> > > >>>
> > > >>> Thanks,
> > > >>> Ke
> > > >>>
> > > >
> > >
> > >
> >
>


-- 
Jagadish

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message