kafka-users mailing list archives

From Todd Palino <tpal...@gmail.com>
Subject Re: Monitoring offset lag
Date Wed, 06 Jul 2016 17:08:16 GMT
Yeah, I've written dissertations at this point on why MaxLag is flawed. We
also used to use the offset checker tool, and later something similar that
was a little easier to slot into our monitoring systems. The problems with
all of these are why I wrote Burrow (https://github.com/linkedin/Burrow).

For more details, you can also check out my blog post on the release:


On Wednesday, July 6, 2016, Tom Dearman <tom.dearman@gmail.com> wrote:

> I recently had a problem in production which I believe was a
> manifestation of the issue kafka-2978 (Topic partition is not sometimes
> consumed after rebalancing of consumer group), this is fixed in and
> we will upgrade our client soon.  However, it made me realise that I didn’t
> have any monitoring set up on this.  The only thing I can find as a metric
> is the
> kafka.consumer:type=ConsumerFetcherManager,name=MaxLag,clientId=([-.\w]+),
> which, if I understand correctly, is the max lag of any partition that that
> particular consumer is consuming.
> 1. If I had been monitoring this, and if my consumer was suffering from
> the issue in kafka-2978, would I actually have been alerted? That is,
> since the consumer would think it is consuming correctly, would it not
> have updated the metric?
> 2. There is another way to see offset lag using the command
> /usr/bin/kafka-consumer-groups --new-consumer --bootstrap-server
> --describe --group consumer_group_name and parsing the
> response.  Is it safe or advisable to do this?  I like the fact that it
> tells me each partition lag, although it is also not available if no
> consumer from the group is currently consuming.
> 3. Is there a better way of doing this?
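
To make the per-partition idea in the questions above concrete, here is a
minimal sketch of the arithmetic involved. It is not Burrow's algorithm and
not the output format of any Kafka tool. It assumes you already have, from
some source, the group's committed offset and the log end offset for each
partition, sampled at two points in time. The function names and the sample
topic "events" are hypothetical. A partition that has lag but whose committed
offset did not move between samples is the kind of case a single max-lag
number reported by the consumer itself can miss.

```python
from typing import Dict, List, Tuple

Partition = Tuple[str, int]  # (topic, partition number)


def partition_lag(committed: Dict[Partition, int],
                  log_end: Dict[Partition, int]) -> Dict[Partition, int]:
    """Lag per partition: log end offset minus the group's committed offset."""
    lag = {}
    for tp, end in log_end.items():
        # Assumption: a partition with no committed offset counts from 0.
        # A real check would use the partition's log start offset instead.
        lag[tp] = max(end - committed.get(tp, 0), 0)
    return lag


def stalled_partitions(prev_committed: Dict[Partition, int],
                       curr_committed: Dict[Partition, int],
                       curr_log_end: Dict[Partition, int]) -> List[Partition]:
    """Partitions that have lag and whose committed offset did not advance."""
    lag = partition_lag(curr_committed, curr_log_end)
    return sorted(
        tp for tp, behind in lag.items()
        if behind > 0 and curr_committed.get(tp, 0) <= prev_committed.get(tp, 0)
    )


# Two samples of committed offsets for a hypothetical topic "events".
prev = {("events", 0): 100, ("events", 1): 200}
curr = {("events", 0): 150, ("events", 1): 200}
end = {("events", 0): 150, ("events", 1): 260}

lags = partition_lag(curr, end)
stalled = stalled_partitions(prev, curr, end)
print(lags)     # partition 0 is caught up, partition 1 is 60 messages behind
print(stalled)  # only partition 1 is flagged
```

Because the check works from offsets gathered outside the consumer process,
it does not depend on the consumer believing it is healthy.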

*Todd Palino*
Staff Site Reliability Engineer
Data Infrastructure Streaming

