samza-dev mailing list archives

From "Navina Ramesh (Apache)" <nav...@apache.org>
Subject Re: [VOTE] SEP-1: Semantics of ProcessorId in Samza
Date Mon, 03 Apr 2017 18:15:34 GMT
+1 (binding) from me :)

Navina

On Sun, Apr 2, 2017 at 9:31 PM, Ignacio Solis <isolis@igso.net> wrote:

> +1 (non binding)
>
> May this be the first of many SEPs...  I mean just as many as needed. :-)
>
> Nacho
>
> On Sat, Apr 1, 2017 at 1:03 PM, Kartik Paramasivam
> <kparamasivam@linkedin.com.invalid> wrote:
> > +1 (non binding)
> >
> > Great to see the SEP process being followed.
> >
> > cheers
> > Kartik
> >
> > On Thu, Mar 30, 2017 at 1:48 PM, Renato Marroquín Mogrovejo <
> > renatoj.marroquin@gmail.com> wrote:
> >
> >> Thanks for the answers Navina!
> >>
> >> +1 (non-binding)
> >>
> >> 2017-03-30 22:32 GMT+02:00 Navina Ramesh <nramesh@linkedin.com.invalid>:
> >>
> >> > Hi Renato,
> >> >
> >> > > Having the big proposals documented on SEPs is really great to have a
> >> > > good understanding on the system!
> >> > I agree. Our previous design process was not being strictly enforced. We
> >> > hope to enforce it going forward, as there are major changes coming in
> >> > the next release.
> >> >
> >> > > So this means that inside a container there will be a single processor?
> >> > A StreamProcessor is nothing more than a Samza container, along with an
> >> > instance of JobCoordinator in it. Think of it as a thin wrapper around
> >> > a SamzaContainer and a JobCoordinator instance. You can find more details
> >> > on this idea here: https://issues.apache.org/jira/browse/SAMZA-1063
> >> > Going forward, we want a Samza job to consist of one or more
> >> > StreamProcessors, instead of N SamzaContainers and 1 AppMaster.
> >> >
> >> > >  is this related to SAMZA-1080 somehow?
> >> > Yep. SAMZA-1080 introduces StreamProcessor with an almost pass-through
> >> > JobCoordinator. In fact, at LinkedIn, one of the teams is already using
> >> > this API with the StandaloneJobCoordinator and delegating partition
> >> > distribution to the Kafka high-level consumer (since SystemConsumer is
> >> > pluggable in Samza, we have some internal wrappers around the high-level
> >> > consumer). It has been working really well for stateless applications, I
> >> > believe.
> >> >
> >> > Cheers!
> >> > Navina
> >> >
> >> > On Thu, Mar 30, 2017 at 1:23 PM, Renato Marroquín Mogrovejo <
> >> > renatoj.marroquin@gmail.com> wrote:
> >> >
> >> > > Hi Navina,
> >> > >
> >> > > Thanks for the great proposal! Having the big proposals documented on
> >> > > SEPs is really great to have a good understanding on the system!
> >> > > I have only a clarification question: the proposal states that every
> >> > > containerId is the same as the processorId. So this means that inside a
> >> > > container there will be a single processor? Is this related to
> >> > > SAMZA-1080 somehow?
> >> > >
> >> > >
> >> > > Best,
> >> > >
> >> > > Renato M.
> >> > >
> >> > > 2017-03-30 20:45 GMT+02:00 Navina Ramesh <nramesh@linkedin.com.invalid>:
> >> > >
> >> > > > Hi Yi,
> >> > > > Good question. Three reasons:
> >> > > >
> >> > > > 1. In SAMZA-881, we came up with a set of responsibilities for the
> >> > > > JobCoordinator. One of them was to generate/assign the processorId. So
> >> > > > it makes sense to keep getProcessorId() within the JobCoordinator
> >> > > > interface.
> >> > > > 2. StreamProcessor was initially introduced as a user-facing API in
> >> > > > SAMZA-1080. ProcessorId was an argument in the StreamProcessor
> >> > > > constructor. This pushed the burden of guaranteeing uniqueness among
> >> > > > the processors of a job onto the user, which was not favorable.
> >> > > > 3. In general, I think we have consensus that the processorIdGenerator
> >> > > > is going to be specific to a runtime environment. Hence, it seems more
> >> > > > appropriate to move it to a lower abstraction layer that deals with
> >> > > > the underlying execution environment.
> >> > > >
> >> > > > Let me know if you have a different perspective on this.
> >> > > >
> >> > > > Cheers!
> >> > > > Navina
> >> > > >
> >> > > > On Thu, Mar 30, 2017 at 9:42 AM, Yi Pan <nickpan47@gmail.com> wrote:
> >> > > >
> >> > > > > @Navina,
> >> > > > >
> >> > > > > Sorry to chime in late. One question:
> >> > > > > 1. Why is it in JobCoordinator, and why not in the StreamProcessor
> >> > > > > class? Because JobCoordinator provides a coordination service across
> >> > > > > many processors, a getProcessorId() method in the JobCoordinator
> >> > > > > interface is confusing as to which processorId we are getting.
> >> > > > >
> >> > > > > Otherwise, the proposal looks good.
> >> > > > >
> >> > > > > -Yi
> >> > > > >
> >> > > > > On Wed, Mar 29, 2017 at 7:57 PM, Navina Ramesh
> >> > > > > <nramesh@linkedin.com.invalid> wrote:
> >> > > > >
> >> > > > > > Good to hear from you, Yan. Thanks! :)
> >> > > > > >
> >> > > > > > On Wed, Mar 29, 2017 at 7:48 PM, Yan Fang <yanfang724@gmail.com>
> >> > > > > > wrote:
> >> > > > > >
> >> > > > > > > +1 . Thanks for the proposal, Navina. :)
> >> > > > > > >
> >> > > > > > > Fang, Yan
> >> > > > > > > yanfang724@gmail.com
> >> > > > > > >
> >> > > > > > > On Thu, Mar 30, 2017 at 4:24 AM, Prateek Maheshwari <
> >> > > > > > > pmaheshwari@linkedin.com.invalid> wrote:
> >> > > > > > >
> >> > > > > > > > +1 (non binding) from me.
> >> > > > > > > >
> >> > > > > > > > - Prateek
> >> > > > > > > >
> >> > > > > > > > On Tue, Mar 28, 2017 at 2:17 PM, Boris S <boryas@gmail.com>
> >> > > > > > > > wrote:
> >> > > > > > > >
> >> > > > > > > > > +1 Looks good to me.
> >> > > > > > > > >
> >> > > > > > > > > On Tue, Mar 28, 2017 at 2:00 PM, xinyu liu <
> >> > > > > > > > > xinyuliu.us@gmail.com> wrote:
> >> > > > > > > > >
> >> > > > > > > > > > +1 on my side. Very happy to see this proposal. This is a
> >> > > > > > > > > > blocker for integrating fluent API with StreamProcessor,
> >> > > > > > > > > > and hopefully we can get it resolved soon :).
> >> > > > > > > > > >
> >> > > > > > > > > > Thanks,
> >> > > > > > > > > > Xinyu
> >> > > > > > > > > >
> >> > > > > > > > > > On Tue, Mar 28, 2017 at 11:28 AM, Navina Ramesh (Apache)
> >> > > > > > > > > > <navina@apache.org> wrote:
> >> > > > > > > > > >
> >> > > > > > > > > > > Hi everyone,
> >> > > > > > > > > > >
> >> > > > > > > > > > > This is a voting thread for SEP-1: Semantics of ProcessorId in Samza.
> >> > > > > > > > > > > For reference, here is the wiki link:
> >> > > > > > > > > > > https://cwiki.apache.org/confluence/display/SAMZA/SEP-1%3A+Semantics+of+ProcessorId+in+Samza
> >> > > > > > > > > > >
> >> > > > > > > > > > > Link to discussion mail thread:
> >> > > > > > > > > > > http://mail-archives.apache.org/mod_mbox/samza-dev/201703.mbox/%3CCANazzuuHiO%3DvZQyFbTiYU-0Sfh3riK%3Dz4j_AdCicQ8rBO%3DXuYQ%40mail.gmail.com%3E
> >> > > > > > > > > > >
> >> > > > > > > > > > > Please vote on this SEP asap. :)
> >> > > > > > > > > > >
> >> > > > > > > > > > > Thanks!
> >> > > > > > > > > > > Navina
> >> > > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > > >
> >> > > > > >
> >> > > > > > --
> >> > > > > > Navina R.
> >> > > > > >
> >> > > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > > --
> >> > > > Navina R.
> >> > > >
> >> > >
> >> >
> >> >
> >> >
> >> > --
> >> > Navina R.
> >> >
> >>
> >
> >
> >
> > --
> > We are hiring in Streams Infra (Kafka/Samza/Datastream) !!
>
>
>
> --
> Nacho - Ignacio Solis - isolis@igso.net
>
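
The design point discussed in the thread (the JobCoordinator, not the user, generates the processorId, and a StreamProcessor is a thin wrapper that holds a JobCoordinator) can be illustrated with a minimal Java sketch. This is not Samza's actual code: only the name getProcessorId() comes from the thread, while the simplified interface, the StandaloneLikeCoordinator class, the id() method, and the host-plus-UUID id scheme are assumptions made up for illustration.

```java
import java.util.UUID;

public class ProcessorIdSketch {

    /** Simplified, hypothetical stand-in for a job coordinator; only getProcessorId() is named in the thread. */
    interface JobCoordinator {
        String getProcessorId();
    }

    /**
     * Hypothetical coordinator for one runtime environment.
     * Assumption: an id built from a host name plus a random UUID is unique enough for this sketch.
     */
    static final class StandaloneLikeCoordinator implements JobCoordinator {
        private final String processorId;

        StandaloneLikeCoordinator(String host) {
            // The id is generated once, so repeated calls return the same value.
            this.processorId = host + "-" + UUID.randomUUID();
        }

        @Override
        public String getProcessorId() {
            return processorId;
        }
    }

    /** Hypothetical thin wrapper: it takes no id argument and asks its coordinator instead. */
    static final class StreamProcessor {
        private final JobCoordinator coordinator;

        StreamProcessor(JobCoordinator coordinator) {
            this.coordinator = coordinator;
        }

        String id() {
            return coordinator.getProcessorId();
        }
    }

    public static void main(String[] args) {
        // Two processors on the same host still receive distinct ids.
        StreamProcessor a = new StreamProcessor(new StandaloneLikeCoordinator("host-a"));
        StreamProcessor b = new StreamProcessor(new StandaloneLikeCoordinator("host-a"));
        System.out.println(a.id());
        System.out.println(b.id());
    }
}
```

Because the wrapper's constructor takes no id, the user cannot accidentally supply a duplicate; a different runtime environment would plug in a different coordinator with its own id scheme.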
