kafka-users mailing list archives

From Liam Clarke-Hutchinson <liam.cla...@adscale.co.nz>
Subject Re: Broker side partition round robin
Date Mon, 01 Jun 2020 21:43:49 GMT
Hi Vinicius,

As you note, the cluster doesn't load balance producers; it relies on them
using a partitioning strategy to do so.

In production, I've never seen actual broker load skew develop from multiple
independent producers round-robining - and we're talking, say, 20 - 50
producers (depending on scaling) writing terabytes over a day.

And load skew / hot brokers is something I monitor closely.

The only time I've seen load skew is when a key-based partitioning strategy
was used and the keys weren't evenly distributed.

So in other words: in theory there's no guarantee, but in my experience,
round-robining across multiple independent producers works fine.
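
For reference, here's a minimal sketch of what I mean by a partitioning
strategy - each producer configured with the stock RoundRobinPartitioner
(assumes kafka-clients 2.4+, where that class ships; the broker address and
topic name are placeholders):

    import java.util.Properties;

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.clients.producer.RoundRobinPartitioner;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class RoundRobinProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                      StringSerializer.class.getName());
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                      StringSerializer.class.getName());
            // Each producer instance cycles through the topic's partitions
            // independently of every other producer.
            props.put(ProducerConfig.PARTITIONER_CLASS_CONFIG,
                      RoundRobinPartitioner.class.getName());

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // Null key, so the configured partitioner picks the partition.
                producer.send(new ProducerRecord<>("my-topic", null, "payload"));
            }
        }
    }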

Cheers,

Liam Clarke

On Mon, 1 Jun. 2020, 11:55 pm Vinicius Scheidegger, <
vinicius.scheidegger@gmail.com> wrote:

> Hey guys, I need some help here...
>
> Is this a flaw in the design (maybe a discussion point for a KIP?), is
> Kafka not supposed to perform equal load balancing with multiple producers,
> or am I missing something (which is what I believe is happening)?
>
> On Wed, May 27, 2020 at 2:40 PM Vinicius Scheidegger <
> vinicius.scheidegger@gmail.com> wrote:
>
>> Does anyone know whether we could really have an "out of the box"
>> solution to do round robin over the partitions when we have multiple
>> producers?
>> By that I mean, a round robin on the broker side (or maybe some way to
>> synchronize all producers).
>>
>> Thank you,
>>
>> On Tue, May 26, 2020 at 1:41 PM Vinicius Scheidegger <
>> vinicius.scheidegger@gmail.com> wrote:
>>
>>> Yes, I checked it. The issue is that RoundRobinPartitioner is bound to
>>> the producer. In a scenario with multiple producers it doesn't guarantee
>>> equal distribution - from what I understood and from my tests, the
>>> following situation happens with it:
>>>
>>> [image: image.png]
>>>
>>> Of course, the first partition is not always 1, and each producer may
>>> start at a different point in time; anyway, my point is that it does not
>>> guarantee equal distribution.
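>>>
>>> A simplified sketch of what I understand each producer instance to be
>>> doing internally (just the essence of the per-instance counter, not the
>>> actual RoundRobinPartitioner source):
>>>
>>>     import java.util.concurrent.atomic.AtomicInteger;
>>>
>>>     // Each producer instance keeps its own counter, so two producers
>>>     // that start together both emit 0, 1, 2, ... and can keep landing
>>>     // on the same partitions at the same moments.
>>>     class NaiveRoundRobin {
>>>         private final AtomicInteger counter = new AtomicInteger(0);
>>>
>>>         int nextPartition(int numPartitions) {
>>>             return Math.floorMod(counter.getAndIncrement(), numPartitions);
>>>         }
>>>     }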
>>>
>>> The other option pointed out is to select the partition myself - either
>>> via shared memory across the producers (assuming that is even possible - I
>>> would need to guarantee that the producers CAN share synchronized state) or
>>> via an intermediate topic with a single partition and a dispatcher/producer
>>> using RoundRobinPartitioner (but that would introduce a single point of
>>> failure).
>>>
>>> [image: image.png]
>>> [image: image.png]
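>>>
>>> To make the first option concrete, a rough sketch of explicit partition
>>> selection (the shared counter is the hypothetical part - it would have to
>>> live in some store every producer can atomically increment; an AtomicLong
>>> stands in for it here):
>>>
>>>     import java.util.concurrent.atomic.AtomicLong;
>>>
>>>     import org.apache.kafka.clients.producer.KafkaProducer;
>>>     import org.apache.kafka.clients.producer.ProducerRecord;
>>>
>>>     class ExplicitPartitionSender {
>>>         // Stand-in for the store all producers would share; in reality
>>>         // it would have to be cross-process, which is exactly the part
>>>         // I don't have out of the box.
>>>         static final AtomicLong SHARED_COUNTER = new AtomicLong(0);
>>>
>>>         static void send(KafkaProducer<String, String> producer,
>>>                          String topic, String payload) {
>>>             int numPartitions = producer.partitionsFor(topic).size();
>>>             int partition =
>>>                 (int) (SHARED_COUNTER.getAndIncrement() % numPartitions);
>>>             // The explicit-partition constructor bypasses the partitioner.
>>>             producer.send(new ProducerRecord<>(topic, partition, null, payload));
>>>         }
>>>     }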
>>>
>>> None of these seems as ideal as a broker-side round robin solution
>>> would be.
>>> Am I missing something? Any other ideas?
>>>
>>> Thanks
>>>
>>> On Tue, May 26, 2020 at 11:34 AM M. Manna <manmedia@gmail.com> wrote:
>>>
>>>> Hey Vinicius,
>>>>
>>>>
>>>> On Tue, 26 May 2020 at 10:27, Vinicius Scheidegger <
>>>> vinicius.scheidegger@gmail.com> wrote:
>>>>
>>>> > In a scenario with multiple independent producers (imagine ephemeral
>>>> > Docker containers that do not know the state of each other), what
>>>> > should be the approach for the messages being sent to be equally
>>>> > distributed over a topic's partitions?
>>>> >
>>>> > From what I understood, the partition selection always happens on the
>>>> > producer. Is this understanding correct?
>>>> >
>>>> > If that's the case, how should one achieve equally distributed load
>>>> > balancing (round robin) over the partitions in a scenario with
>>>> > multiple producers?
>>>> >
>>>> > Thank you,
>>>> >
>>>> > Vinicius Scheidegger
>>>>
>>>>
>>>>  Have you checked RoundRobinPartitioner? Also, you can always specify
>>>> which partition you are writing to, so you can control the partitioning
>>>> in your own way.
>>>>
>>>> Regards,
>>>>
