samza-dev mailing list archives

From Chris Pettitt <cpett...@linkedin.com.INVALID>
Subject Re: [DISCUSS] SEP-2: ApplicationRunner Design
Date Thu, 20 Apr 2017 19:52:30 GMT
It might be worth taking a look at how Beam does test streams. The API is
more powerful than just passing in a queue, e.g.:

TestStream<String> source = TestStream.create(StringUtf8Coder.of())
    .addElements(TimestampedValue.of("this", start))
    .addElements(TimestampedValue.of("that", start))
    .addElements(TimestampedValue.of("future",
        start.plus(Duration.standardMinutes(1))))
    .advanceProcessingTime(Duration.standardMinutes(3))
    .advanceWatermarkTo(start.plus(Duration.standardSeconds(30)))
    .advanceWatermarkTo(start.plus(Duration.standardMinutes(1)))
    .advanceWatermarkToInfinity();

---

BTW, have we given up on the idea of a simpler input system, e.g. one that
assumes all input messages are keyed? It seems it would be possible to
support legacy "system streams" via an adapter that mapped K, V -> V' and
could open the possibility of inputs in whatever form users want, e.g.
(again from Beam):

final Create.Values<String> values = Create.of("test", "one", "two", "three");

final TextIO.Read.Bound from = TextIO.Read.from("src/main/resources/words.txt");

final KafkaIO.Read<Integer, Long> reader = KafkaIO.<Integer, Long>read()
        .withBootstrapServers("myServer1:9092,myServer2:9092")
        .withTopics(topics)
        .withConsumerFactoryFn(new ConsumerFactoryFn(
            topics, 10, numElements, OffsetResetStrategy.EARLIEST))
        .withKeyCoder(BigEndianIntegerCoder.of())
        .withValueCoder(BigEndianLongCoder.of())
        .withMaxNumRecords(numElements);

Ideally, such a simple input system specification would be useable in
production as well as test. At that point I don't know if we need a
separate TestApplicationRunner except perhaps as a hint to what we've
been calling an Environment?
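
As a rough sketch of the adapter idea (every name here is hypothetical and
none of these types exist in Samza or Beam), a keyed input could be mapped
K, V -> V' with a plain function, assuming messages are held in memory:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

// Hypothetical keyed message; stands in for whatever a keyed input would emit.
final class KeyedMessage<K, V> {
    final K key;
    final V value;

    KeyedMessage(K key, V value) {
        this.key = key;
        this.value = value;
    }
}

// Hypothetical adapter: maps each (K, V) pair to a legacy-style value V'.
final class LegacyStreamAdapter<K, V, R> {
    private final BiFunction<K, V, R> mapper;

    LegacyStreamAdapter(BiFunction<K, V, R> mapper) {
        this.mapper = mapper;
    }

    // Applies the mapper to every message, preserving input order.
    List<R> adapt(List<KeyedMessage<K, V>> input) {
        List<R> out = new ArrayList<>(input.size());
        for (KeyedMessage<K, V> m : input) {
            out.add(mapper.apply(m.key, m.value));
        }
        return out;
    }
}
```

The same adapter shape would work for test and production inputs, since it
only depends on the keyed message and not on where the messages come from.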

---

Aren't we supposed to be able to run applications without blocking (e.g.
for embedded cases)? The API suggests that run is going to be a blocking
call?
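
One possible shape for this, as a sketch only (the interface and class names
are hypothetical and are not the API proposed in SEP-2): make run()
non-blocking by returning a future, and layer a blocking variant on top.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical runner interface; not the SEP-2 ApplicationRunner.
interface SketchRunner {
    // Starts the application and returns immediately with a handle.
    CompletableFuture<Void> run(Runnable application);

    // Blocking convenience: waits until the application completes.
    default void runBlocking(Runnable application) {
        run(application).join();
    }
}

// Hypothetical implementation that runs the application on the common pool.
final class ThreadedSketchRunner implements SketchRunner {
    @Override
    public CompletableFuture<Void> run(Runnable application) {
        return CompletableFuture.runAsync(application);
    }
}
```

An embedded caller would use run() and keep the returned future; a main()
method that should not exit early would use runBlocking().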

- Chris


On Thu, Apr 20, 2017 at 1:06 PM, Jacob Maes <jacob.maes@gmail.com> wrote:

> Thanks for the SEP!
>
> +1 on introducing these new components
> -1 on the current definition of their roles (see Design feedback below)
>
> *Design*
>
>    - If LocalJobRunner and RemoteJobRunner handle the different methods of
>    launching a Job, what additional value do the different types of
>    ApplicationRunner and RuntimeEnvironment provide? It seems like a red
>    flag that all 3 would need to change from environment to environment. It
>    indicates that they don't have proper modularity. The
>    call-sequence-figures support this; LocalApplicationRunner and
>    RemoteApplicationRunner make the same calls and the diagram only varies
>    after jobRunner.start()
>    - As far as I can tell, the only difference between Local and Remote
>    ApplicationRunner is that one is blocking and the other is
>    non-blocking. If that's all they're for then either the names should be
>    changed to reflect this, or they should be combined into one
>    ApplicationRunner and just expose separate methods for run() and
>    runBlocking()
>    - There isn't much detail on why the main() methods for Local/Remote
>    have such different implementations, how they receive the Application
>    (direct vs config), and concretely how the deployment scripts, if any,
>    should interact with them.
>
>
> *Style*
>
>    - nit: None of the 11 uses of the word "actual" in the doc are
>    *actually* needed. :-)
>    - nit: Colors of the runtime blocks in the diagrams are unconventional
>    and a little distracting. Reminds me of nai won bao. Now I'm hungry. :-)
>    - Prefer the name "ExecutionEnvironment" over "RuntimeEnvironment". The
>    term "execution environment" is used
>    - The code comparisons for the ApplicationRunners are not apples-apples.
>    The local runner example is an application that USES the local runner.
>    The remote runner example is just the runner code itself. So, it's not
>    readily apparent that we're comparing the main() methods and not the
>    application itself.
>
>
> On Mon, Apr 17, 2017 at 5:02 PM, Yi Pan <nickpan47@gmail.com> wrote:
>
> > Made some updates to clarify the role and functions of RuntimeEnvironment
> > in SEP-2.
> >
> > On Fri, Apr 14, 2017 at 9:30 AM, Yi Pan <nickpan47@gmail.com> wrote:
> >
> > > Hi, everyone,
> > >
> > > > In light of new features such as fluent API and standalone that
> > > > introduce new deployment / application launch models in Samza, I
> > > > created a new SEP-2 to address the new use cases. SEP-2 link:
> > > > https://cwiki.apache.org/confluence/display/SAMZA/SEP-2%3A+ApplicationRunner+Design
> > >
> > > Please take a look and give feedbacks!
> > >
> > > Thanks!
> > >
> > > -Yi
> > >
> >
>
