samza-dev mailing list archives

From Ignacio Solis <iso...@igso.net>
Subject Re: [VOTE] SEP-1: Semantics of ProcessorId in Samza
Date Mon, 03 Apr 2017 04:31:08 GMT
+1 (non binding)

May this be the first of many SEPs...  I mean just as many as needed. :-)

Nacho

On Sat, Apr 1, 2017 at 1:03 PM, Kartik Paramasivam
<kparamasivam@linkedin.com.invalid> wrote:
> +1 (non binding)
>
> Great to see the SEP process being followed.
>
> cheers
> Kartik
>
> On Thu, Mar 30, 2017 at 1:48 PM, Renato Marroquín Mogrovejo <renatoj.marroquin@gmail.com> wrote:
>
>> Thanks for the answers Navina!
>>
>> +1 (non-binding)
>>
>> 2017-03-30 22:32 GMT+02:00 Navina Ramesh <nramesh@linkedin.com.invalid>:
>>
>> > Hi Renato,
>> >
>> > > Having the big proposals documented on SEPs is really great to have a
>> > > good understanding on the system!
>> > I agree. Our previous design process was not being strictly enforced. We
>> > hope to enforce it going forward as there are major changes coming into the
>> > next release.
>> >
>> > > So this means that inside a container there will be a single processor?
>> > StreamProcessor is nothing more than a Samza container, along with an
>> > instance of JobCoordinator in it. Think of it as a thin wrapper around a
>> > SamzaContainer and a JobCoordinator instance. You can find more details on
>> > this idea here - https://issues.apache.org/jira/browse/SAMZA-1063
>> > Going forward, we want a Samza job to consist of one or more
>> > StreamProcessors, instead of N SamzaContainers and 1 AppMaster.
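The "thin wrapper" idea described above can be illustrated with a minimal Java sketch. All type names here (`Coordinator`, `Container`, `ProcessorWrapper`, `WrapperDemo`) are hypothetical stand-ins, not the real Samza classes, and the start/stop ordering is an assumption made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a job coordinator (NOT the Samza interface).
interface Coordinator {
    void start();
    void stop();
}

// Hypothetical stand-in for a container (NOT SamzaContainer).
interface Container {
    void run();
    void shutdown();
}

// The wrapper owns one coordinator and one container and only delegates.
final class ProcessorWrapper {
    private final Coordinator coordinator;
    private final Container container;

    ProcessorWrapper(Coordinator coordinator, Container container) {
        this.coordinator = coordinator;
        this.container = container;
    }

    // Assumed ordering: coordinator first, then the container.
    void start() {
        coordinator.start();
        container.run();
    }

    // Assumed ordering: tear down in reverse.
    void stop() {
        container.shutdown();
        coordinator.stop();
    }
}

final class WrapperDemo {
    // Runs one start/stop cycle and returns the recorded call order.
    static List<String> runOnce() {
        List<String> events = new ArrayList<>();
        Coordinator coordinator = new Coordinator() {
            public void start() { events.add("coordinator.start"); }
            public void stop() { events.add("coordinator.stop"); }
        };
        Container container = new Container() {
            public void run() { events.add("container.run"); }
            public void shutdown() { events.add("container.shutdown"); }
        };
        ProcessorWrapper processor = new ProcessorWrapper(coordinator, container);
        processor.start();
        processor.stop();
        return events;
    }
}
```

In this sketch a job would simply be one or more `ProcessorWrapper` instances, each carrying its own coordinator, rather than a set of containers plus a separate master.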
>> >
>> > >  is this related to SAMZA-1080 somehow?
>> > Yep. SAMZA-1080 introduces StreamProcessor with an almost pass-through
>> > JobCoordinator. In fact, at LinkedIn, one of the teams is already using
>> > this API with the StandaloneJobCoordinator and delegating partition
>> > distribution to the Kafka high-level consumer (since SystemConsumer is
>> > pluggable in Samza, we have some internal wrappers around the high-level
>> > consumer). It has been working really well for stateless applications, I
>> > believe.
>> >
>> > Cheers!
>> > Navina
>> >
>> > On Thu, Mar 30, 2017 at 1:23 PM, Renato Marroquín Mogrovejo <renatoj.marroquin@gmail.com> wrote:
>> >
>> > > Hi Navina,
>> > >
>> > > Thanks for the great proposal! Having the big proposals documented on SEPs
>> > > is really great to have a good understanding on the system!
>> > > I have only one clarification question: the proposal states that every
>> > > containerId is the same as the processorId. So this means that inside a
>> > > container there will be a single processor? Is this related to SAMZA-1080
>> > > somehow?
>> > >
>> > >
>> > > Best,
>> > >
>> > > Renato M.
>> > >
>> > > 2017-03-30 20:45 GMT+02:00 Navina Ramesh <nramesh@linkedin.com.invalid>:
>> > >
>> > > > Hi Yi,
>> > > > Good question. Three reasons:
>> > > >
>> > > > 1. In SAMZA-881, we came up with a set of responsibilities for the
>> > > > JobCoordinator. One of them was to generate/assign the processorId. So, it
>> > > > makes sense to keep getProcessorId() within the JobCoordinator interface.
>> > > > 2. StreamProcessor was initially introduced as a user-facing API in
>> > > > SAMZA-1080. ProcessorId was an argument in the StreamProcessor constructor. It
>> > > > was pushing the burden of guaranteeing uniqueness among the processors of a job
>> > > > to the user. This was not favorable.
>> > > > 3. In general, I think we have consensus that the processorIdGenerator is
>> > > > going to be specific to a runtime environment. Hence, it seems more
>> > > > appropriate to move it to a lower abstraction layer that deals with the
>> > > > underlying execution environment.
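The three reasons above can be sketched roughly in Java. This is an illustration under stated assumptions, not the SEP-1 design or the Samza API: `IdAssigningCoordinator`, `CountingCoordinator`, and `SketchProcessor` are hypothetical names, and the host-plus-counter scheme is only one conceivable environment-specific way to generate an id (it is unique only within a single JVM).

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical coordinator contract: the coordinator, not the user,
// is responsible for producing the processor id.
interface IdAssigningCoordinator {
    String getProcessorId();
}

// Hypothetical environment-specific implementation. The id scheme
// (host name plus a JVM-wide counter) is an assumption for illustration.
final class CountingCoordinator implements IdAssigningCoordinator {
    private static final AtomicInteger NEXT = new AtomicInteger();
    private final String processorId;

    CountingCoordinator(String hostName) {
        // The id is generated once and stays stable for this coordinator.
        this.processorId = hostName + "-" + NEXT.getAndIncrement();
    }

    @Override
    public String getProcessorId() {
        return processorId;
    }
}

// The processor takes no id argument; it asks its coordinator instead,
// so the user no longer has to guarantee uniqueness.
final class SketchProcessor {
    private final IdAssigningCoordinator coordinator;

    SketchProcessor(IdAssigningCoordinator coordinator) {
        this.coordinator = coordinator;
    }

    String id() {
        return coordinator.getProcessorId();
    }
}
```

Swapping in a different `IdAssigningCoordinator` implementation changes how ids are generated without touching `SketchProcessor`, which is the kind of separation point 3 argues for.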
>> > > >
>> > > > Let me know if you have a different perspective on this.
>> > > >
>> > > > Cheers!
>> > > > Navina
>> > > >
>> > > > On Thu, Mar 30, 2017 at 9:42 AM, Yi Pan <nickpan47@gmail.com> wrote:
>> > > >
>> > > > > @Navina,
>> > > > >
>> > > > > Sorry to chime in late. One question:
>> > > > > 1. Why is it in JobCoordinator, and why not in the StreamProcessor class?
>> > > > > Because JobCoordinator provides coordination service across many
>> > > > > processors, an interface getProcessorId() in JobCoordinator is confusing
>> > > > > regarding which processorId we are getting.
>> > > > >
>> > > > > Otherwise, the proposal looks good.
>> > > > >
>> > > > > -Yi
>> > > > >
>> > > > > On Wed, Mar 29, 2017 at 7:57 PM, Navina Ramesh
>> > > > > <nramesh@linkedin.com.invalid> wrote:
>> > > > >
>> > > > > > Good to hear from you, Yan. Thanks! :)
>> > > > > >
>> > > > > > On Wed, Mar 29, 2017 at 7:48 PM, Yan Fang <yanfang724@gmail.com> wrote:
>> > > > > >
>> > > > > > > +1 . Thanks for the proposal, Navina. :)
>> > > > > > >
>> > > > > > > Fang, Yan
>> > > > > > > yanfang724@gmail.com
>> > > > > > >
>> > > > > > > On Thu, Mar 30, 2017 at 4:24 AM, Prateek Maheshwari
>> > > > > > > <pmaheshwari@linkedin.com.invalid> wrote:
>> > > > > > >
>> > > > > > > > +1 (non binding) from me.
>> > > > > > > >
>> > > > > > > > - Prateek
>> > > > > > > >
>> > > > > > > > On Tue, Mar 28, 2017 at 2:17 PM, Boris S <boryas@gmail.com> wrote:
>> > > > > > > >
>> > > > > > > > > +1 Looks good to me.
>> > > > > > > > >
>> > > > > > > > > On Tue, Mar 28, 2017 at 2:00 PM, xinyu liu <xinyuliu.us@gmail.com> wrote:
>> > > > > > > > >
>> > > > > > > > > > +1 on my side. Very happy to see this proposal. This is a blocker for
>> > > > > > > > > > integrating fluent API with StreamProcessor, and hopefully we can get it
>> > > > > > > > > > resolved soon :).
>> > > > > > > > > >
>> > > > > > > > > > Thanks,
>> > > > > > > > > > Xinyu
>> > > > > > > > > >
>> > > > > > > > > > On Tue, Mar 28, 2017 at 11:28 AM, Navina Ramesh (Apache)
>> > > > > > > > > > <navina@apache.org> wrote:
>> > > > > > > > > >
>> > > > > > > > > > > Hi everyone,
>> > > > > > > > > > >
>> > > > > > > > > > > This is a voting thread for SEP-1: Semantics of ProcessorId in Samza.
>> > > > > > > > > > > For reference, here is the wiki link:
>> > > > > > > > > > > https://cwiki.apache.org/confluence/display/SAMZA/SEP-1%3A+Semantics+of+ProcessorId+in+Samza
>> > > > > > > > > > >
>> > > > > > > > > > > Link to discussion mail thread:
>> > > > > > > > > > > http://mail-archives.apache.org/mod_mbox/samza-dev/201703.mbox/%3CCANazzuuHiO%3DvZQyFbTiYU-0Sfh3riK%3Dz4j_AdCicQ8rBO%3DXuYQ%40mail.gmail.com%3E
>> > > > > > > > > > >
>> > > > > > > > > > > Please vote on this SEP asap. :)
>> > > > > > > > > > >
>> > > > > > > > > > > Thanks!
>> > > > > > > > > > > Navina
>> > > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > > --
>> > > > > > Navina R.
>> > > > > >
>> > > > >
>> > > >
>> > > >
>> > > >
>> > > > --
>> > > > Navina R.
>> > > >
>> > >
>> >
>> >
>> >
>> > --
>> > Navina R.
>> >
>>
>
>
>
> --
> We are hiring in Streams Infra (Kafka/Samza/Datastream) !!



-- 
Nacho - Ignacio Solis - isolis@igso.net
