kafka-users mailing list archives

From James Cheng <jch...@tivo.com>
Subject How to manage the consumer group id?
Date Wed, 10 Jun 2015 20:20:37 GMT

How are people specifying/persisting/resetting the consumer group identifier ("group.id")
when using the high-level consumer?

I understand how it works. I specify some string and all consumers that use that same string
will help consume a topic. The partitions will be distributed amongst them for consumption.
And when they save their offsets, the offsets will be saved according to the consumer group.
That all makes sense to me.

What I don't understand is the best way to set and persist them, and reset them if needed.
For example, do I simply hardcode the string in my code? If so, then all deployed instances
will have the same value (that's good). If I want to bring up a test instance of that code,
or a new installation, though, then it will also share the load (that's bad). 

If I pass in a value to my instances, that lets me have different test and production instances
of the same code (that's good), but then I have to persist my consumer group id somewhere
outside of the process (on disk, in zookeeper, etc). Which then means I need some way to manage
*that* identifier (that's... just how it is?).
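
To make the "pass in a value" option concrete, here is a minimal sketch. It assumes the id arrives through an environment variable and falls back to a hardcoded default when none is set. The variable name CONSUMER_GROUP_ID, the default "myapp-prod", and the class and method names are all made up for illustration; only the "group.id" key is the real consumer setting.

```java
import java.util.Map;
import java.util.Properties;

public class GroupIdConfig {
    // Hypothetical default used by every normally deployed instance.
    public static final String DEFAULT_GROUP_ID = "myapp-prod";
    // Hypothetical environment variable that a test instance can override.
    public static final String ENV_VAR = "CONSUMER_GROUP_ID";

    // Returns the override if one is set and non-blank, otherwise the default.
    public static String resolveGroupId(Map<String, String> env) {
        String value = env.get(ENV_VAR);
        if (value == null || value.trim().isEmpty()) {
            return DEFAULT_GROUP_ID;
        }
        return value.trim();
    }

    // Builds only the group-related part of the consumer configuration;
    // connection settings would be added by the caller.
    public static Properties consumerProperties(Map<String, String> env) {
        Properties props = new Properties();
        props.setProperty("group.id", resolveGroupId(env));
        return props;
    }

    public static void main(String[] args) {
        Properties props = consumerProperties(System.getenv());
        System.out.println("group.id=" + props.getProperty("group.id"));
    }
}
```

With this shape, production instances need no extra state, and a test instance is started with CONSUMER_GROUP_ID set to something else. It does not remove the management problem: the override still has to live in whatever launches the process.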

What if I decide that I want my app to start over? In the case of log-compacted streams, I
want to throw away any processing I did and start "from the beginning". Do I change my consumer
group, which effectively resets everything? Or do I delete my saved offsets, and then resume
with the same consumer group? The latter is functionally equivalent to the former.
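
The "change my consumer group" option could be sketched as below, assuming the group id is built from a fixed base name plus a generation number that is bumped whenever the app should start over. The class name, method names, and the "-gen" suffix are hypothetical. "auto.offset.reset" with the value "smallest" is the high-level consumer setting that makes a group with no saved offsets begin at the earliest available message.

```java
import java.util.Properties;

public class ResettableGroup {
    // Hypothetical naming scheme: "<base>-gen<N>".
    public static String groupId(String baseName, int generation) {
        return baseName + "-gen" + generation;
    }

    // Builds only the group-related part of the consumer configuration.
    public static Properties consumerProperties(String baseName, int generation) {
        Properties props = new Properties();
        props.setProperty("group.id", groupId(baseName, generation));
        // A group that has never committed offsets starts from the beginning.
        props.setProperty("auto.offset.reset", "smallest");
        return props;
    }

    public static void main(String[] args) {
        // Bumping the generation yields a group with no saved offsets.
        System.out.println(consumerProperties("myapp", 1).getProperty("group.id"));
        System.out.println(consumerProperties("myapp", 2).getProperty("group.id"));
    }
}
```

In this sketch the generation number is itself a piece of state that has to be stored somewhere, and the previous generation's saved offsets are left behind rather than deleted.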

