kafka-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Clark Breyman <cl...@breyman.com>
Subject Re: High Level Consumer delivery semantics
Date Wed, 29 Jan 2014 20:31:54 GMT
Guozhang,

Thank make sense except for the following:

- the ZookeeperConsumerConnector.commitOffsets() method commits the current
value of PartitionTopicInfo.consumeOffset  for all of the active streams.

- the ConsumerIterator in the streams advances the value of
PartitionTopicInfo.consumeOffset *before* next() returns, not after the
processing on that message is complete.

If you have multiple threads consuming, thread A calling commitOffsets()
may commit thread B's retrieved but unprocessed message, no?


On Wed, Jan 29, 2014 at 12:20 PM, Guozhang Wang <wangguoz@gmail.com> wrote:

> Hi Clark,
>
> In practice, the client app code need to always commit offset after it has
> processed the messages, and hence only the second case may happen, leading
> to "at least once".
>
> Guozhang
>
>
> On Wed, Jan 29, 2014 at 11:51 AM, Clark Breyman <clark@breyman.com> wrote:
>
> > Wrestling through the at-least/most-once semantics of my application and
> I
> > was hoping for some confirmation of the semantics. I'm not sure I can
> > classify the high level consumer as either  type.
> >
> > False ack scenario:
> > - Thread A: call next() on the ConsumerIterator, advancing the
> > PartitionTopicInfo offset
> > - Thread B: commitOffsets() flushed offset of incomplete message to ZK
> > - Thread A: fail processing (e.g. kill -9)
> >
> > False retry scenario:
> > - Thread A: call next() & successfully process, kill -9 before
> > commitOffsets either in thread or in parallel.
> >
> > Is this right or am I missing something (likely)? Seems like the
> semantics
> > are essentially approximately once.
> >
>
>
>
> --
> -- Guozhang
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message