kafka-users mailing list archives

From Henry Thacker <he...@bath.edu>
Subject Consumer position not returning correct offset - 0.10.0.1
Date Wed, 05 Sep 2018 07:28:08 GMT
Hi,

In testing, I’ve got a 5-node Kafka cluster with min.insync.replicas set to 4. The cluster
is currently running version 0.10.0.1.

There’s an application that publishes to a topic and, on restart, attempts to read the
full contents of the topic up to the high watermark before publishing further messages
to the topic. This particular topic routinely contains between 600,000 and 5 million messages
and consists of a single partition, as we rely on precise ordering. To determine the offset
of the high watermark, we do something similar to the following:

// Do not use offset management
props.put("enable.auto.commit", false);
props.put("auto.offset.reset", "latest");

// ...

consumer.assign(tpList);
consumer.poll(100);

long maxOffset = consumer.position(tp) - 1;

This had been working fine for months and then, all of a sudden, we found that maxOffset was
returning an offset that could be tens or even thousands of offsets away from the real high watermark.
As far as I can tell, the cluster state seems to be OK: there are no offline partitions or
out-of-sync replicas when we see this issue. We are also not using transactional messages
in Kafka.

We found that doing an explicit seekToEnd before checking the position seems to help, e.g.:

// ...

consumer.assign(tpList);
consumer.poll(100);

consumer.seekToEnd(tpList);
long maxOffset = consumer.position(tp) - 1;

But I can’t understand why this is necessary when everything seemed to be working
before. I’m now worried that the cluster is in a bad state, and that we’re either not capturing
the health status correctly or missing some error messages somewhere.
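
For reference, the seekToEnd-then-position step above could be wrapped in a small helper along these lines. This is only a sketch: the class and method names (HighWatermark, lastOffset) are made up for illustration, and the two consumer calls are passed in as lambdas so the snippet runs without the Kafka client on the classpath. It also assumes that an end position of 0 means the partition has never held a message, and it does not account for messages removed by retention.

```java
import java.util.OptionalLong;
import java.util.function.LongSupplier;

// Hypothetical helper, not part of the Kafka client API.
final class HighWatermark {

    /**
     * Seeks to the end, reads the position, and returns the offset of the
     * last message (position - 1). Returns empty when the end position is 0,
     * which this sketch treats as "no messages yet".
     */
    static OptionalLong lastOffset(Runnable seekToEnd, LongSupplier position) {
        seekToEnd.run();                          // e.g. () -> consumer.seekToEnd(tpList)
        long endPosition = position.getAsLong();  // e.g. () -> consumer.position(tp)
        return endPosition > 0
                ? OptionalLong.of(endPosition - 1)
                : OptionalLong.empty();
    }

    public static void main(String[] args) {
        // Stand-in for a consumer whose position is stale until seekToEnd runs.
        long[] pos = {4_990};
        OptionalLong last = lastOffset(() -> pos[0] = 5_000, () -> pos[0]);
        System.out.println(last);
    }
}
```

With a real consumer, the call would look something like
lastOffset(() -> consumer.seekToEnd(tpList), () -> consumer.position(tp)).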

Keen for any ideas about what this might be or for things I can try.

Thanks in advance,
Henry








