kafka-users mailing list archives

From Jun Rao <jun...@gmail.com>
Subject Re: Exactly once semantics
Date Thu, 08 Dec 2011 18:03:42 GMT
Neha is right. It's possible to achieve exactly-once delivery even with the
high-level consumer. What you have to do is make sure all consumed messages
are really consumed and then call commitOffsets(). When you call
commitOffsets(), all messages returned to the app should have been fully
consumed or put in a safe place.
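
A minimal sketch of that ordering (finish processing, then commit), in Python. FakeConsumer, poll, commit_offsets, restart and consume are hypothetical in-memory stand-ins for illustration, not the Kafka API. One assumption is added here: the sink is keyed by offset, so a batch that is replayed after a crash between the write and the commit overwrites itself instead of creating duplicates.

```python
class FakeConsumer:
    """Hypothetical in-memory stand-in for a consumer; not the Kafka API."""

    def __init__(self, log):
        self.log = log
        self.position = 0   # next offset to hand to the application
        self.committed = 0  # offset to resume from after a restart

    def poll(self, max_messages):
        """Return up to max_messages (offset, message) pairs and advance."""
        end = min(self.position + max_messages, len(self.log))
        batch = [(off, self.log[off]) for off in range(self.position, end)]
        self.position = end
        return batch

    def commit_offsets(self):
        """Record that everything handed out so far is safely consumed."""
        self.committed = self.position

    def restart(self):
        """Simulate a restart: resume from the last committed offset."""
        self.position = self.committed


def consume(consumer, sink, batch_size, crash_on_batch=None):
    """Process each batch completely, and only then commit."""
    batch_no = 0
    while True:
        batch = consumer.poll(batch_size)
        if not batch:
            return
        for offset, message in batch:
            sink[offset] = message  # keyed by offset, so replays are harmless
        if batch_no == crash_on_batch:
            raise RuntimeError("simulated crash before commit")
        consumer.commit_offsets()
        batch_no += 1
```

If the process dies before the commit, the uncommitted batch is delivered again on restart; the offset-keyed sink is what keeps the end result free of duplicates in this sketch.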

Thanks,

Jun

On Thu, Dec 8, 2011 at 9:52 AM, Neha Narkhede <neha.narkhede@gmail.com> wrote:

> Mark,
>
> >> Is that correct? Did you mean SimpleConsumer or HighLevelConsumer? What
> are the differences?
>
> The high level consumer checkpoints the offsets in zookeeper, either
> periodically or based on an API call (look at commitOffsets()).
>
> If you want to checkpoint each and every message offset, exactly-once
> semantics will be expensive. But if you are willing to tolerate a small
> window of duplicates, you could buffer and write the offsets in batches.
> If you choose to do the former, the commitOffsets() approach is expensive,
> since that can lead to too many writes on zookeeper. If you choose the
> latter, it could be fine, and you can use the high level consumer itself.
>
> On the other hand, if your consumer is writing the messages to some database
> or persistent storage, you might be better off using SimpleConsumer. There
> was another discussion about making the offset storage of the high level
> consumer pluggable, but we don't have that feature yet.
>
> Thanks,
> Neha
>
>
> On Thu, Dec 8, 2011 at 9:32 AM, Jun Rao <junrao@gmail.com> wrote:
>
> > Currently, the high level consumer (with ZK integration) doesn't expose
> > offsets to the consumer. Only SimpleConsumer does.
> >
> > Jun
> >
> > On Thu, Dec 8, 2011 at 9:15 AM, Mark <static.void.dev@gmail.com> wrote:
> >
> > > "This is only possible through SimpleConsumer right now."
> > >
> > >
> > > Is that correct? Did you mean SimpleConsumer or HighLevelConsumer? What
> > > are the differences?
> > >
> > >
> > > On 12/8/11 8:53 AM, Jun Rao wrote:
> > >
> > >> Mark,
> > >>
> > > >> Today, this is mostly the responsibility of the consumer, by managing the
> > > >> offsets properly. For example, if the consumer periodically flushes
> > > >> messages to disk, it has to checkpoint to disk the offset corresponding to
> > > >> the last flush. On failure, the consumer has to rewind the consumption from
> > > >> the last checkpointed offset. This is only possible through SimpleConsumer
> > > >> right now.
> > >>
> > >> Thanks,
> > >>
> > >> Jun
> > >>
> > > >> On Thu, Dec 8, 2011 at 8:18 AM, Mark <static.void.dev@gmail.com> wrote:
> > > >>
> > > >>> How can one guarantee exactly-once semantics when using Kafka as a
> > > >>> traditional queue? Is this guarantee the responsibility of the consumer?
> > >>>
> > >>>
> >
>
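
The checkpoint-and-rewind approach from the earliest reply quoted above (flush messages to disk, checkpoint the offset of the last flush, rewind to it on failure) could look roughly like the following Python sketch. All names here (flush_with_checkpoint, load_checkpoint, run) are hypothetical, and the log is a plain list rather than a Kafka partition. The sketch assumes the flushed records and the offset live in one file that is replaced atomically, so the two can never disagree.

```python
import json
import os
import tempfile


def flush_with_checkpoint(path, records, next_offset):
    """Persist the records and the offset to resume from in one atomic step."""
    state = {"next_offset": next_offset, "records": records}
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)  # atomic rename: old state or new state, never a mix


def load_checkpoint(path):
    """Return the last flushed state, or an empty one on first start."""
    if not os.path.exists(path):
        return {"next_offset": 0, "records": []}
    with open(path) as f:
        return json.load(f)


def run(log, path, flush_every, crash_at=None):
    """Consume `log` from the checkpointed offset, flushing periodically."""
    state = load_checkpoint(path)
    records = state["records"]
    offset = state["next_offset"]  # rewind point after a failure
    while offset < len(log):
        if crash_at is not None and offset == crash_at:
            raise RuntimeError("simulated crash")
        records.append(log[offset])
        offset += 1
        if offset % flush_every == 0 or offset == len(log):
            flush_with_checkpoint(path, records, offset)
    return records
```

Messages consumed after the last flush are lost from memory on a crash, but because the consumer restarts from the checkpointed offset they are read again exactly once into the flushed output.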
