kafka-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Насыров Ренат <renat-nasy...@yandex.ru>
Subject Rebalancing during the long-running tasks
Date Tue, 16 Feb 2016 15:15:41 GMT
Hello!

I'm trying to use kafka for long-running tasks processing. The tasks can be very short (less
than a second) or very long (about 10 minutes). I've got one consumer group for the single
queue, and one or more consumers. Sometimes consumers manage to commit their offsets before
rebalancing, sometimes not (and fail). Accordning to this document ( https://cwiki.apache.org/confluence/display/KAFKA/Kafka+0.9+Consumer+Rewrite+Design ),
in worst case (when all the consumers are on very long tasks) it goes as follows:

1) Consumers get long tasks from the queue.
2) Consumers performing their long-running tasks.
3) Session timeout happens.
4) Group coordinator performs a rebalance; the current generation number is increased.
5) Consumers complete their long-running tasks and commit.
6) GroupCoordinator returns IllegalGeneration errors to consumers and does not allow to commit
the offsets.
7) Consumers reconnect and get the very messages from the previous generation, thus stucking
in forever loop.

Suggestions:

1) Commit first, then process. Inacceptable in my case because it leads to at-most-once semantics.
2) Increase session timeout limit. Not desired because task duration can negatively affect
the effectiveness of rebalance.

Is there any proper way to complete long-running tasks?

Mime
View raw message