kafka-users mailing list archives

From Damian Guy <damian....@gmail.com>
Subject Re: understanding consumer rebalance trigger(s)
Date Fri, 03 Mar 2017 08:39:45 GMT
Hi Jon,

On 0.10.0.1 it means that your processing is taking longer than the
configured session.timeout.ms.

"The timeout used to detect failures when using Kafka's " +
        "group management facilities. When a consumer's heartbeat is not received within the session timeout, " +
        "the broker will mark the consumer as failed and rebalance the group. Since heartbeats are sent only " +
        "when poll() is invoked, a higher session timeout allows more time for message processing in the consumer's " +
        "poll loop at the cost of a longer time to detect hard failures. See also <code>" + MAX_POLL_RECORDS_CONFIG + "</code> for " +
        "another option to control the processing time in the poll loop. Note that the value must be in the " +
        "allowable range as configured in the broker configuration by <code>group.min.session.timeout.ms</code> " +
        "and <code>group.max.session.timeout.ms</code>.";


You might want to try increasing session.timeout.ms. Alternatively, you
can decrease max.poll.records so there is less work to do on each poll().
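
As a rough illustration of the two options above, here is a minimal sketch
in Java. It only builds a java.util.Properties object with the two settings
named in this thread; the class name, the helper names and all of the
numbers (30 s timeout, 20 ms per record, using half of the timeout as a
budget) are assumptions made up for the example, not measured values or
recommendations. A real consumer would also need bootstrap.servers,
group.id and deserializers before these properties are passed to
KafkaConsumer.

```java
import java.util.Properties;

// Sketch only: uses plain java.util.Properties, so no Kafka dependency is needed.
// RebalanceTuning and its methods are hypothetical names for this example.
final class RebalanceTuning {

    /** Builds the two consumer settings discussed above. */
    static Properties tunedConsumerProps(int sessionTimeoutMs, int maxPollRecords) {
        Properties props = new Properties();
        // Must lie within the broker's group.min.session.timeout.ms and
        // group.max.session.timeout.ms range.
        props.setProperty("session.timeout.ms", Integer.toString(sessionTimeoutMs));
        // Fewer records per poll() means less processing time between polls.
        props.setProperty("max.poll.records", Integer.toString(maxPollRecords));
        return props;
    }

    /**
     * Largest batch whose processing fits in a fraction of the session timeout,
     * given an assumed average processing time per record.
     */
    static int maxRecordsWithin(int sessionTimeoutMs, double perRecordMs, double safetyFraction) {
        if (perRecordMs <= 0) {
            throw new IllegalArgumentException("perRecordMs must be positive");
        }
        return (int) Math.floor(sessionTimeoutMs * safetyFraction / perRecordMs);
    }

    public static void main(String[] args) {
        // Assumed numbers: 30 s timeout, 20 ms per record, spend at most half the timeout.
        int batch = maxRecordsWithin(30_000, 20.0, 0.5);
        Properties props = tunedConsumerProps(30_000, batch);
        System.out.println("session.timeout.ms=" + props.getProperty("session.timeout.ms"));
        System.out.println("max.poll.records=" + props.getProperty("max.poll.records"));
    }
}
```

The point of the second helper is only to show the trade-off: on this
version heartbeats are sent from poll(), so the time spent processing one
batch has to stay comfortably below the session timeout, either by raising
the timeout or by shrinking the batch.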

Thanks,
Damian

On Fri, 3 Mar 2017 at 01:52 Jon Yeargers <jon.yeargers@cedexis.com> wrote:

> Im wondering what the parameters are to instantiate a consumer rebalance. I
> have a topic that turns roughly 50K / minute across 6 partitions. Each is
> serviced by a separate dockerized consumer.
>
> Roughly every 8-12 min this goes into a rebalance that may take up to a
> minute. When it returns it often puts some or all partitions on to a single
> consumer (leaving others idle). This may persist for a minute while it
> tries another arrangement. Eventually after 2-3 tries it will evenly
> distribute the partitions.. for a few minutes until it does another
> misguided attempt. As a result we have lag increasing from 0 to ~450K and
> back to 0 on a cycle.
>
> The data rate is assumed to be roughly consistent through these cycles.
>
> Resultant graph of lag is a sawtooth shape.
>
> Using 0.10.0.1
> 3 brokers
>
>
> Also - is there some way to set / control consumer 'assignment'? Or to
> 'suggest' a setting?
>
