samza-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Nicholas Audo <na...@naudo.de>
Subject Re: At-least once processing guarantee
Date Thu, 25 Jul 2019 15:50:42 GMT
Hello Jagadish,
Is the rejected alternatives document meant to be private (link at the
bottom of the document)?

On Wed, Jul 24, 2019 at 11:46 AM Jagadish Venkatraman <jagadish@apache.org>
wrote:

> Dear Samza users,
>
> We recently discovered an issue with the way we handle state in Samza Beam
> and Samza High-Level API Window operators. Under certain situations, at
> least once processing guarantee is violated.
>
>
> *Details on the issue*
>
> The Samza high-level API includes operators such as windows which can hold
> messages and emit them at a later time. eg: a time based window will buffer
> its messages and emit results only at periodic time intervals. Here’s the
> sequence of operations when a window operator is ready to emit results:
>
>    1.
>
>    Obtain the results ready to be emitted for the window
>    2.
>
>    Remove the operator state corresponding to those results from its
>    state-store
>    3.
>
>    Propagate the window results to down-stream operators in the pipeline.
>    The results eventually makes it to a terminal operator, which emits the
>    final output.
>    4.
>
>    At some future point, issue a commit operation, which flushes the
>    producers, the state stores and persists the input offsets.
>
>
> Scenarios
>
>    -
>
>    An exception in a downstream operator in step 3.
>    -
>
>    An unclean shutdown, e.g., due to a “kill -9” to the container before
>    the outputs have been flushed in step 4
>
>
> Both these examples violate Samza’s at-least once processing guarantee,
> since they cause state to be modified even though the processed outputs may
> not have been emitted.
>
>
> *Solution*
>
> Here's a proposal
> <
> https://docs.google.com/document/d/1wtSAUMrns14cWCf5Wdf4YwXd7K5Hf-bm7W3SQ3sXZsQ/edit
> >
> to fix the above issue. We are working on addressing this with the highest
> priority.
>
>
> Thanks,
>
> Jagadish
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message