kafka-users mailing list archives

From JIEFU GONG <jg...@berkeley.edu>
Subject Kafka rebalancing and general questions
Date Wed, 24 Jun 2015 18:37:29 GMT
Hi all,

I've very recently become interested in Kafka and have studied it a great
deal, but I'm still confused about some of the more basic design aspects of
the technology. Despite going through various online resources, I haven't
found the answers I was looking for, so I'm hoping this community can give
me a helping hand!

- Can someone explain rebalancing to me as if I were a 5 year old? I
understand that Kafka triggers a rebalance when a consumer joins an existing
consumer group, and that the load of reading from partitions is then
rebalanced among all consumers in the group (let me know if I am wrong on
this). What I don't understand is how it works, and why there are claims
online that a rebalance can take anywhere from 5-15s.
            - Specifically, in the context of my own experiments: if I have
a topic with a single partition and one consumer from a group reading from
it, why is it that when I add another consumer to that same group, reading
from the same topic, my machines trigger a rebalance that takes maybe
8-10 seconds to resolve before I can consume the published messages again?
Shouldn't this process be fast, since in the end only one of the two
consumer instances in that group is going to be reading from that single
partition anyway?
            - In another experiment, I launched a consumer ready to consume
from a topic 'test'. I launched the producer shortly after, with the topic
'test' in 2 partitions, and saw that my single consumer began to receive
messages as expected. Then I introduced another consumer to that same group
(basically the same setup as above). While the producer kept running, the
original consumer stopped consuming messages, and when the rebalance finally
completed, both consumer instances seemed to lurch ahead and each displayed
10-15 messages that I had thought were lost. Does anyone know what might
have happened there?
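
To make my mental model concrete, here is a toy Python sketch of how I
picture partitions being divided among the members of a group after a
rebalance. This is only my assumption about the outcome, not Kafka's actual
assignment code; `assign_partitions` and the consumer names are hypothetical.

```python
def assign_partitions(partitions, consumers):
    """Toy model: split partitions across consumers in contiguous ranges.

    Hypothetical helper for illustration only; it is not part of Kafka.
    """
    members = sorted(consumers)  # deterministic order, so every member agrees
    assignment = {c: [] for c in members}
    if not members:
        return assignment
    per_consumer, extra = divmod(len(partitions), len(members))
    start = 0
    for i, consumer in enumerate(members):
        # The first `extra` consumers each take one additional partition.
        count = per_consumer + (1 if i < extra else 0)
        assignment[consumer] = list(partitions[start:start + count])
        start += count
    return assignment


# One partition, two consumers: the second consumer ends up idle.
print(assign_partitions([0], ["c1", "c2"]))
# Two partitions, two consumers: one partition each.
print(assign_partitions([0, 1], ["c1", "c2"]))
```

In the single-partition case one of the two consumers gets an empty list,
which is why I expected that rebalance to be cheap.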

- Lastly, is it safe to use the trunk KafkaConsumer? (This is the one for
v0.9, correct?) I've been running my experiments with it, and I'm not sure
whether that is safe or viable. Thanks for any help!

