kafka-users mailing list archives

From Manikumar <manikumar.re...@gmail.com>
Subject Re: Kafka Client Consumes Last Committed Message On Restart
Date Wed, 20 Sep 2017 06:31:07 GMT

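If you are using the commitSync(Map<TopicPartition, OffsetAndMetadata> offsets)
API, the committed offset should be the offset of the next message your
application will consume, i.e. lastProcessedMessageOffset + 1. On restart the
consumer resumes from the committed offset itself, so committing 100 means
the message at offset 100 is delivered again.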
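
A minimal sketch of that rule, assuming nothing beyond plain Java: the class
CommitOffsetSketch and its methods are hypothetical and only model the offset
arithmetic, they are not part of the Kafka client library. It shows that a
committed offset is the position to resume from, so committing the last
processed offset re-delivers that message, while committing offset + 1 does not.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a consumer; it makes no Kafka client calls.
class CommitOffsetSketch {

    // Models the committed position stored per partition number.
    private final Map<Integer, Long> committed = new HashMap<>();

    // Offset to commit after processing a record: the position of the NEXT record.
    static long offsetToCommit(long lastProcessedOffset) {
        return lastProcessedOffset + 1;
    }

    // Records the committed position for a partition.
    void commit(int partition, long offset) {
        committed.put(partition, offset);
    }

    // After a restart, fetching resumes AT the committed offset (inclusive).
    // Simplification: with no commit this returns 0; a real consumer would
    // apply its auto.offset.reset policy instead.
    long firstOffsetAfterRestart(int partition) {
        return committed.getOrDefault(partition, 0L);
    }

    public static void main(String[] args) {
        CommitOffsetSketch consumer = new CommitOffsetSketch();
        long lastProcessed = 100L;

        // Committing the last processed offset: message 100 comes back.
        consumer.commit(1, lastProcessed);
        System.out.println("committed 100, resumes at " + consumer.firstOffsetAfterRestart(1));

        // Committing last processed + 1: consumption resumes at 101.
        consumer.commit(1, offsetToCommit(lastProcessed));
        System.out.println("committed 101, resumes at " + consumer.firstOffsetAfterRestart(1));
    }
}
```

With the real client, the equivalent commit for a processed ConsumerRecord
named record would be along the lines of
consumer.commitSync(Collections.singletonMap(new TopicPartition(record.topic(), record.partition()), new OffsetAndMetadata(record.offset() + 1)));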


On Wed, Sep 20, 2017 at 12:23 AM, Manan G <manan.gcs@gmail.com> wrote:

> Hello,
> I am using Kafka broker and Java client library v
> When I restart my Kafka consumer application which uses Java Kafka client
> library to retrieve messages, I notice that for each partition, the message
> associated with the last offset that was committed successfully gets
> re-consumed. I am not using the auto-commit feature of the Java Kafka client
> library.
> So, for example, for topic partition 1:
> 1. Consumer application commits offset 100 manually.
> 2. Consumer application gets restarted.
> 3. Consumer polls for messages from Kafka. This time, the first message
> polled from Kafka is the one with offset 100. Shouldn't the new batch of
> messages polled from Kafka start from offset 101?
> This causes duplication of messages - one for each partition on application
> restart. I understand that message duplication is possible in other
> scenarios too, but this particular behavior coming from the client library
> seemed strange. In my case, I am writing a framework for our use case, and
> we would like to avoid this specific message duplication scenario.
> Questions:
> * Is there a reason for this behavior, or am I missing something in my
> code?
> * If this is by design, what would be the easiest way for a user
> application to work around it?
> Thanks in advance.
