kafka-users mailing list archives

From Jun Rao <jun...@gmail.com>
Subject Re: Consumer re-balance behavior
Date Mon, 24 Oct 2011 14:31:26 GMT
During rebalance, we simply sort all partitions and consumers by name and
give each consumer an even range of partitions. Since partitions on the same
broker sort together, they tend to be given out to the same consumer, as in
this case.
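
As a rough illustration of the range assignment described above (a sketch, not the actual Kafka code), the following sorts both lists by name and hands each consumer a contiguous slice. The function name assign_ranges and the rule that earlier consumers absorb any leftover partitions are assumptions made for the example.

```python
def assign_ranges(partitions, consumers):
    """Sketch of a sorted, even range assignment (hypothetical helper)."""
    if not consumers:
        raise ValueError("need at least one consumer")
    parts = sorted(partitions)   # e.g. "1-0" sorts before "2-0"
    cons = sorted(consumers)
    # Assumption: leftover partitions go to the first `extra` consumers.
    per, extra = divmod(len(parts), len(cons))
    assignment = {}
    for i, consumer in enumerate(cons):
        start = per * i + min(i, extra)
        count = per + (1 if i < extra else 0)
        assignment[consumer] = parts[start:start + count]
    return assignment


# The case from the original question: 2 brokers x 2 partitions, 2 consumers.
result = assign_ranges(["2-1", "1-0", "2-0", "1-1"], ["C2", "C1"])
print(result)
```

With the four partitions from the question, C1 ends up with 1-0 and 1-1 and C2 with 2-0 and 2-1, purely because of the sort order and not because of where the consumers run.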

Since the partition is the unit of rebalance, you want to have at least as
many partitions as consumers; otherwise some consumers are left with nothing
to consume. This is the main reason to have more than one partition per
broker.

The number of partitions is controlled by two config parameters:
num.partitions and topic.partition.count.map. The former is the default for
all topics and the latter overrides it for specific topics.
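
For illustration, the broker config might look something like the fragment below. The topic name T and the counts are made up, and the comma-separated topic:count syntax for topic.partition.count.map is an assumption, so check it against the sample server.properties for your version.

```properties
# Default number of partitions for every topic (value is illustrative)
num.partitions=2
# Assumed syntax: per-topic overrides as topic:count pairs
topic.partition.count.map=T:4
```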

Jun

On Mon, Oct 24, 2011 at 1:28 AM, Inder Pall <inder.pall@gmail.com> wrote:

> All,
>
> need some clarity and confirmation on the following behavior.
>
> Use-Case
> ------------
> 1. I have a topic T spread across two brokers (B1, B2) running on different
> machines, each having 2 partitions configured for T, for a total of 4
> partitions (1-0, 1-1, 2-0, 2-1).
> 2. Consumer C1 is part of group g1 and is consuming from B1 and B2 for T.
> 3. Add a new consumer C2 part of g1
>
> This triggers a rebalance across C1 & C2, and eventually C1 gets 1-0, 1-1
> and C2 gets 2-0, 2-1.
> P.S. - B1 and C1 share the same machine, as do B2 and C2.
>
> Behavior
> ---------
> Both consumers are getting the partitions hosted on their own boxes.
> Is this a coincidence, or an optimization for data locality that will
> always be applied?
>
> More questions
> -----------------
> 1. When would you want to have multiple partitions of the same topic hosted
> on the same broker? Is it so that you can have 2 partitions of T on B1 and
> 10 on B2, and on rebalance C1 & C2 would get 6 each?
> 2. As in the above use-case, C1 has partitions 1-0 & 1-1 of T, and adding
> messages to B1 results in the messages being spread across both
> partitions. Is this behavior round robin, or is it based on segment file
> size or other parameters?
> 3. Is it possible to configure the number of partitions per topic? If so,
> how?
>
> -- Inder
>
