kafka-users mailing list archives

From Jun Rao <jun...@gmail.com>
Subject Re: Exactly-once semantics with compression
Date Fri, 01 Jun 2012 04:51:04 GMT

With compression enabled, exactly-once is a bit harder to implement, since
the offset only advances after a whole compressed batch of messages has been
consumed. So you have to make sure that each batch of messages is consumed
together as a unit. The other option is to compress with a batch size of 1.
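
A rough sketch of the first option, assuming the offset behaviour Ross
describes in the quoted message (every message in a compressed batch reports
the batch's start offset, except the last one, which reports the start of the
next batch). Entry, Batch, BatchGrouper and group are hypothetical names made
up for illustration, not part of the Kafka API; Entry just stands in for the
payload and offset you would read from MessageAndOffset.

```java
import java.util.ArrayList;
import java.util.List;

final class BatchGrouper {
    // Hypothetical stand-in for a consumed message: payload plus the offset reported with it.
    record Entry(String payload, long offset) {}

    // A complete batch, plus the offset to persist together with the consumer state.
    record Batch(List<String> payloads, long nextOffset) {}

    // Groups entries into complete batches by watching for a change in the reported offset.
    // startOffset is the offset the fetch started from (the last persisted offset).
    static List<Batch> group(long startOffset, List<Entry> entries) {
        List<Batch> batches = new ArrayList<>();
        List<String> pending = new ArrayList<>();
        long batchStart = startOffset;
        for (Entry e : entries) {
            pending.add(e.payload());
            // Assumption: an offset different from the batch start marks the last
            // message of the batch and is the start of the next batch.
            if (e.offset() != batchStart) {
                batches.add(new Batch(List.copyOf(pending), e.offset()));
                pending.clear();
                batchStart = e.offset();
            }
        }
        // Messages still in 'pending' belong to an incomplete batch and are not
        // returned; they will be fetched again from the last persisted offset.
        return batches;
    }
}
```

The idea is to apply each Batch to the consumer state and persist nextOffset
in the same atomic write, so a restart always resumes on a batch boundary.
With uncompressed messages every message reports a new offset, so each one
ends up as a batch of one.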



On Thu, May 31, 2012 at 8:05 PM, Ross Black <ross.w.black@gmail.com> wrote:

> Hi,
> Using SimpleConsumer, I get the offset of a message (from MessageAndOffset)
> and persist it with my consumer data to get exactly-once semantics for
> consumer state (as described in the kafka design docs).  If the consumer
> fails then it is simply a matter of starting replay of messages from the
> persisted index.
> When using compression, the offset from MessageAndOffset appears to be the
> offset of the compressed batch.  e.g. For a batch of 10 messages, the
> offset returned for messages 1-9 is the start of the *current* batch, and
> the offset for message 10 is the start of the *next* batch.
> How can I get the exactly-once semantics for consumer state?
> Is there a way that I can get a batch of messages from SimpleConsumer?
> (otherwise I have to reconstruct a batch by watching for a change in the
> offset between messages)
> Thanks,
> Ross
