samza-dev mailing list archives

From Chris Pettitt <cpett...@linkedin.com.INVALID>
Subject Re: [DISCUSS] SEP-2: ApplicationRunner Design
Date Sat, 22 Apr 2017 00:00:08 GMT
I'm playing with ApplicationRunner, so I'll probably have more feedback.
For now, in addition to async run we also need async notification of
completion or failure. Also, ApplicationStatus should be able to give me
the cause of failure (e.g. via an Exception), not just a failure state.
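
A minimal sketch of what this request could look like, assuming a runner that returns a future instead of blocking. Every name here (AsyncRunSketch, Status, State, runAsync) is hypothetical and chosen for illustration; none of it is taken from SEP-2 or from the Samza API. The job is reduced to a plain Runnable so that the example stays self-contained.

```java
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

/** Hypothetical sketch; none of these names come from SEP-2 or the Samza API. */
final class AsyncRunSketch {

    enum State { SUCCEEDED, FAILED }

    /** Terminal status that carries the failure cause, not only the state. */
    static final class Status {
        private final State state;
        private final Throwable cause; // null unless state == FAILED

        private Status(State state, Throwable cause) {
            this.state = state;
            this.cause = cause;
        }

        static Status succeeded() { return new Status(State.SUCCEEDED, null); }
        static Status failed(Throwable cause) { return new Status(State.FAILED, cause); }

        State state() { return state; }
        Optional<Throwable> cause() { return Optional.ofNullable(cause); }
    }

    /** Starts the job on another thread and returns immediately. */
    static CompletableFuture<Status> runAsync(Runnable job) {
        return CompletableFuture
            .runAsync(job)
            // Map both outcomes to a Status so callers see a single result type.
            .handle((ignored, error) ->
                error == null ? Status.succeeded() : Status.failed(unwrap(error)));
    }

    /** CompletableFuture wraps failures in CompletionException; expose the original. */
    private static Throwable unwrap(Throwable error) {
        return (error instanceof CompletionException && error.getCause() != null)
            ? error.getCause()
            : error;
    }

    public static void main(String[] args) {
        CompletableFuture<Status> done = runAsync(() -> {
            throw new IllegalStateException("input stream unavailable");
        });
        // Asynchronous notification: the callback fires on completion or failure.
        CompletableFuture<Void> notified = done.thenAccept(status ->
            System.out.println(status.state() + " "
                + status.cause().map(Throwable::getMessage).orElse("")));
        notified.join(); // only here so the demo does not exit before the callback runs
    }
}
```

The caller decides whether to block (join) or to register a callback (thenAccept), so one run method could serve both the embedded and the blocking case.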

On Thu, Apr 20, 2017 at 3:52 PM, Chris Pettitt <cpettitt@linkedin.com>
wrote:

> It might be worth taking a look at how Beam does test streams. The API is
> more powerful than just passing in a queue, e.g.:
>
> TestStream<String> source = TestStream.create(StringUtf8Coder.of())
>     .addElements(TimestampedValue.of("this", start))
>     .addElements(TimestampedValue.of("that", start))
>     .addElements(TimestampedValue.of("future", start.plus(Duration.standardMinutes(1))))
>     .advanceProcessingTime(Duration.standardMinutes(3))
>     .advanceWatermarkTo(start.plus(Duration.standardSeconds(30)))
>     .advanceWatermarkTo(start.plus(Duration.standardMinutes(1)))
>     .advanceWatermarkToInfinity();
>
> ---
>
> BTW, have we given up on the idea of a simpler input system, e.g. one that
> assumes all input messages are keyed? It seems it would be possible to
> support legacy "system streams" via an adapter that mapped K, V -> V' and
> could open the possibility of inputs in whatever form users want, e.g.
> (again from Beam):
>
> final Create.Values<String> values = Create.of("test", "one", "two", "three");
>
> final TextIO.Read.Bound from = TextIO.Read.from("src/main/resources/words.txt");
>
> final KafkaIO.Read<Integer, Long> reader = KafkaIO.<Integer, Long>read()
>         .withBootstrapServers("myServer1:9092,myServer2:9092")
>         .withTopics(topics)
>         .withConsumerFactoryFn(new ConsumerFactoryFn(
>             topics, 10, numElements, OffsetResetStrategy.EARLIEST))
>         .withKeyCoder(BigEndianIntegerCoder.of())
>         .withValueCoder(BigEndianLongCoder.of())
>         .withMaxNumRecords(numElements);
>
> Ideally, such a simple input system specification would be usable in
> production as well as test. At that point I don't know if we need a separate
> TestApplicationRunner except perhaps as a hint to what we've been calling an
> Environment?
>
> ---
>
> Aren't we supposed to be able to run applications without blocking (e.g.
> for embedded cases)? The API suggests that run is going to be a blocking
> call?
>
> - Chris
>
>
> On Thu, Apr 20, 2017 at 1:06 PM, Jacob Maes <jacob.maes@gmail.com> wrote:
>
>> Thanks for the SEP!
>>
>> +1 on introducing these new components
>> -1 on the current definition of their roles (see Design feedback below)
>>
>> *Design*
>>
>>    - If LocalJobRunner and RemoteJobRunner handle the different methods of
>>    launching a Job, what additional value do the different types of
>>    ApplicationRunner and RuntimeEnvironment provide? It seems like a red flag
>>    that all 3 would need to change from environment to environment. It
>>    indicates that they don't have proper modularity. The call-sequence-figures
>>    support this; LocalApplicationRunner and RemoteApplicationRunner make the
>>    same calls and the diagram only varies after jobRunner.start()
>>    - As far as I can tell, the only difference between Local and Remote
>>    ApplicationRunner is that one is blocking and the other is non-blocking. If
>>    that's all they're for then either the names should be changed to reflect
>>    this, or they should be combined into one ApplicationRunner and just expose
>>    separate methods for run() and runBlocking()
>>    - There isn't much detail on why the main() methods for Local/Remote
>>    have such different implementations, how they receive the Application
>>    (direct vs config), and concretely how the deployment scripts, if any,
>>    should interact with them.
>>
>>
>> *Style*
>>
>>    - nit: None of the 11 uses of the word "actual" in the doc are *actually*
>>    needed. :-)
>>    - nit: Colors of the runtime blocks in the diagrams are unconventional
>>    and a little distracting. Reminds me of nai won bao. Now I'm hungry. :-)
>>    - Prefer the name "ExecutionEnvironment" over "RuntimeEnvironment". The
>>    term "execution environment" is used
>>    - The code comparisons for the ApplicationRunners are not apples-apples.
>>    The local runner example is an application that USES the local runner. The
>>    remote runner example is just the runner code itself. So, it's not
>>    readily apparent that we're comparing the main() methods and not the
>>    application itself.
>>
>>
>> On Mon, Apr 17, 2017 at 5:02 PM, Yi Pan <nickpan47@gmail.com> wrote:
>>
>> > Made some updates to clarify the role and functions of RuntimeEnvironment
>> > in SEP-2.
>> >
>> > On Fri, Apr 14, 2017 at 9:30 AM, Yi Pan <nickpan47@gmail.com> wrote:
>> >
>> > > Hi, everyone,
>> > >
>> > > In light of new features such as fluent API and standalone that introduce
>> > > new deployment / application launch models in Samza, I created a new SEP-2
>> > > to address the new use cases. SEP-2 link:
>> > > https://cwiki.apache.org/confluence/display/SAMZA/SEP-2%3A+ApplicationRunner+Design
>> > >
>> > > Please take a look and give feedback!
>> > >
>> > > Thanks!
>> > >
>> > > -Yi
>> > >
>> >
>>
>
>
