kafka-users mailing list archives

From Joe Stein <joe.st...@stealth.ly>
Subject Re: Uniform Distribution of Messages for Topic Across Partitions Without Effecting Performance
Date Tue, 05 Aug 2014 01:50:42 GMT
Bhavesh, take a look at
https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whyisdatanotevenlydistributedamongpartitionswhenapartitioningkeyisnotspecified
?

Maybe the root cause is something else? Even if some producers produce
more or less than others, you should be able to make the distribution
random enough with a partitioner and a key.  I don't think you should need
more than what is in the FAQ, but in case you do, maybe look into
http://en.wikipedia.org/wiki/MurmurHash as another hash option.
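
To illustrate the "partitioner and a key" idea, here is a minimal Python sketch, not Kafka's own partitioner. It assumes the 32-bit MurmurHash3 variant, and the names murmur3_32 and partition_for, the "msg-N" keys, and the partition count of 8 are all made up for the example. The point is only that hashing a varied key and taking it modulo the partition count spreads messages roughly evenly.

```python
from collections import Counter

MASK32 = 0xFFFFFFFF


def _rotl32(x: int, r: int) -> int:
    """Rotate a 32-bit integer left by r bits."""
    return ((x << r) | (x >> (32 - r))) & MASK32


def murmur3_32(data: bytes, seed: int = 0) -> int:
    """MurmurHash3 (x86, 32-bit) of data, returned as an unsigned int."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & MASK32
    n = len(data)
    body_len = n - (n % 4)

    # Mix the input four bytes at a time (little-endian blocks).
    for i in range(0, body_len, 4):
        k = int.from_bytes(data[i:i + 4], "little")
        k = (k * c1) & MASK32
        k = _rotl32(k, 15)
        k = (k * c2) & MASK32
        h ^= k
        h = _rotl32(h, 13)
        h = (h * 5 + 0xE6546B64) & MASK32

    # Mix the remaining 1-3 bytes, if any.
    tail = data[body_len:]
    if tail:
        k = int.from_bytes(tail, "little")
        k = (k * c1) & MASK32
        k = _rotl32(k, 15)
        k = (k * c2) & MASK32
        h ^= k

    # Finalization: fold in the length and avalanche the bits.
    h ^= n
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & MASK32
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & MASK32
    h ^= h >> 16
    return h


def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a message key to a partition index (hypothetical helper)."""
    return murmur3_32(key) % num_partitions


# Hypothetical demo: 8000 distinct keys spread over 8 partitions.
NUM_PARTITIONS = 8
counts = Counter(
    partition_for(f"msg-{i}".encode(), NUM_PARTITIONS) for i in range(8000)
)
for p in range(NUM_PARTITIONS):
    print(p, counts[p])
```

The key only needs to vary from message to message (a random value, a UUID, a sequence number); a constant key would send everything to one partition.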

/*******************************************
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
********************************************/


On Mon, Aug 4, 2014 at 9:12 PM, Bhavesh Mistry <mistry.p.bhavesh@gmail.com>
wrote:

> How can we achieve a uniform distribution of non-keyed messages per topic
> across all partitions?
>
> We have tried to achieve this uniform distribution with a custom
> partitioner in each producer instance that uses round robin (
> count(messages) % number of partitions for the topic). This strategy results
> in very poor performance, so we have switched back to the random stickiness
> that Kafka provides out of the box, per topic, for some interval (10 minutes,
> not sure exactly).
>
> The above strategy sometimes results in consumer-side lag for some
> partitions, because some applications/producers produce more
> messages for the same topic than other servers do.
>
> Can Kafka provide uniform distribution out of the box by coordinating
> among all producers and relying on a measured rate, such as # of messages
> per minute or # of bytes produced per minute, to achieve uniform
> distribution and coordinate partition stickiness among hundreds of
> producers for the same topic?
>
> Thanks,
>
> Bhavesh
>
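
The two strategies described in the quoted message can be compared with a small Python simulation. This is only a sketch under stated assumptions: the per-producer message counts, the partition count of 4, and the function names round_robin_counts and sticky_counts are hypothetical, and "sticky" is modeled as each producer picking one random partition for the whole interval. It says nothing about throughput, only about how messages end up distributed.

```python
import random
from collections import Counter


def round_robin_counts(rates, num_partitions):
    """Per-partition totals when each producer uses count(messages) % partitions."""
    counts = Counter()
    for rate in rates:
        for sent in range(rate):
            counts[sent % num_partitions] += 1
    return [counts[p] for p in range(num_partitions)]


def sticky_counts(rates, num_partitions, rng):
    """Per-partition totals when each producer sticks to one random partition."""
    counts = Counter()
    for rate in rates:
        counts[rng.randrange(num_partitions)] += rate
    return [counts[p] for p in range(num_partitions)]


# Hypothetical message counts for four producers over one sticky interval;
# one producer is much busier than the rest.
rates = [1000, 1000, 5000, 200]
num_partitions = 4

rr = round_robin_counts(rates, num_partitions)
sticky = sticky_counts(rates, num_partitions, random.Random(42))

print("round robin:", rr)
print("sticky:     ", sticky)
```

With round robin every partition receives the same total regardless of how uneven the producers are, while with sticky assignment whichever partition the busiest producer lands on receives at least its 5000 messages, which matches the consumer-side lag described above.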
