kafka-users mailing list archives

From Chris Curtin <curtin.ch...@gmail.com>
Subject Re: custom kafka consumer - strangeness
Date Thu, 09 Jan 2014 21:14:03 GMT
If you look at the example simple consumer:
https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example

You'll see:

    if (currentOffset < readOffset) {
        System.out.println("Found an old offset: " + currentOffset + " Expecting: " + readOffset);
        continue;
    }

and a comment in the 'Reading the Data' part:

Also note that we are explicitly checking that the offset being read is not
less than the offset that we requested. This is needed since if Kafka is
compressing the messages, the fetch request will return an entire
compressed block even if the requested offset isn't the beginning of the
compressed block. Thus a message we saw previously may be returned again.

This is probably what is happening to you.
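
A minimal sketch of that check, modelled on plain long offsets rather than the real Kafka classes. OffsetFilter, Result and process are hypothetical names for illustration only, and advancing to "last accepted offset + 1" is an assumption of this sketch (the wiki example takes the next offset from the message itself):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the Kafka API: it models the
// "skip offsets older than the one requested" check on plain longs.
public class OffsetFilter {

    // Outcome of handling one fetch response.
    public static final class Result {
        public final List<Long> accepted;  // offsets passed on to the application
        public final long nextReadOffset;  // offset to ask for in the next fetch

        Result(List<Long> accepted, long nextReadOffset) {
            this.accepted = accepted;
            this.nextReadOffset = nextReadOffset;
        }
    }

    // returnedOffsets: offsets of the messages in one fetch response, in order.
    // readOffset: the offset that was requested in that fetch.
    public static Result process(long[] returnedOffsets, long readOffset) {
        List<Long> accepted = new ArrayList<>();
        long next = readOffset;
        for (long currentOffset : returnedOffsets) {
            if (currentOffset < readOffset) {
                // Belongs to a compressed block that starts before the
                // requested offset; it was already seen, so skip it.
                continue;
            }
            accepted.add(currentOffset);
            // Assumption of this sketch: the next fetch starts right after
            // the last accepted message.
            next = currentOffset + 1;
        }
        return new Result(accepted, next);
    }
}
```

For example, if a fetch for offset 11 returns a compressed block holding offsets 10, 11 and 12, process(new long[]{10, 11, 12}, 11) keeps only 11 and 12 and advances the read offset to 13.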

Chris


On Thu, Jan 9, 2014 at 4:00 PM, Gerrit Jansen van Vuuren <
gerritjvv@gmail.com> wrote:

> Hi,
>
> I'm writing a custom consumer for kafka 0.8.
> Everything works except for the following:
>
> a. connect, send fetch, read all results
> b. send fetch
> c. send fetch
> d. send fetch
> e. via the console publisher, publish 2 messages
> f. send fetch :corr-id 1
> g. read 2 messages published :offsets [10 11] :corr-id 1
> h. send fetch :corr-id 2
> i. read 2 messages published :offsets [10 11] :corr-id 2
> j.  send fetch ...
>
> The problem is I get the messages sent twice as a response to two separate
> fetch requests. The correlation id is distinct so it cannot be that I read
> the response twice. The offsets of the 2 messages are the same so they
> are duplicates, and it's not the producer sending the messages twice.
>
> Note: the same connection is kept open the whole time, and I send, block,
> receive, then send again. After the first 2 messages are read, the
> offsets are incremented and the next fetch will ask kafka to give it
> messages from the new offsets.
>
> Any ideas why kafka would be sending the messages again on the second
> fetch request?
>
> Regards,
>  Gerrit
>
