kafka-users mailing list archives

From Avi Flax <avi.f...@parkassist.com>
Subject Re: Strategy for true random producer keying
Date Tue, 24 Jan 2017 19:51:05 GMT

> On Jan 24, 2017, at 14:17, Jon Yeargers <jon.yeargers@cedexis.com> wrote:
> 
> It may be picking a random partition but it sticks with it indefinitely
> despite there being a significant disparity in traffic.

Ah, I forgot to mention that, IIRC, the default Partitioner impl doesn’t choose a random partition
for each individual record; it chooses one at random every ~10 minutes and, for that period,
sends all records to that partition.

Sorry I don’t have a citation for this... I think it’s been mentioned before in this list
somewhere. (And of course it’s in the source code.)

> I need to break it
> up in some different fashion. Maybe just a hash of
> System.currentTimeMillis()?

You could probably just use the result of currentTimeMillis() as the key. However, I don’t
recommend using a synthetic key, because down the road other folks may end up thinking it
has semantic value. Rather, I recommend you either implement a custom impl of Partitioner,
or simply assign a random partition ID to each ProducerRecord as I described earlier.
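
For illustration, here is a minimal sketch of the "random partition ID per record" idea. It only covers choosing the partition and deliberately leaves out the Kafka client so it can run on its own. The class name RandomPartitionChooser and the partition count of 6 are made up for the example; in a real producer the count would presumably come from something like producer.partitionsFor(topic).size(), and the chosen value would be passed as the partition argument of the ProducerRecord(topic, partition, key, value) constructor.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper, not part of the Kafka client.
public class RandomPartitionChooser {

    // Returns a uniformly random partition index in [0, partitionCount).
    public static int choose(int partitionCount) {
        if (partitionCount <= 0) {
            throw new IllegalArgumentException("partitionCount must be positive");
        }
        return ThreadLocalRandom.current().nextInt(partitionCount);
    }

    public static void main(String[] args) {
        int partitionCount = 6; // assumed value for the demo
        int[] counts = new int[partitionCount];

        // Simulate assigning a partition to each of 60,000 records.
        // With the real client, each record would be built roughly as:
        //   new ProducerRecord<>(topic, choose(partitionCount), null, value)
        for (int i = 0; i < 60_000; i++) {
            counts[choose(partitionCount)]++;
        }

        // The counts should come out roughly even across partitions.
        for (int p = 0; p < partitionCount; p++) {
            System.out.println("partition " + p + ": " + counts[p]);
        }
    }
}
```

Because the partition is set explicitly and the key stays null, nobody downstream can mistake a synthetic key for something meaningful.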

> meant to say mod%partition count of System.currentTimeMillis()


Well, that’s essentially the default partitioning algorithm when a record has a key (a hash of
the key, modulo the partition count). So there’s no need to re-implement that; as I wrote above,
you could just use the current time as the key, and that should yield much the same behavior.
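
To make the "hash of the key, mod partition count" idea concrete, here is a rough standalone sketch. It is not Kafka's code: the real default partitioner hashes the serialized key bytes with murmur2, whereas this uses Long.hashCode purely as a stand-in, and the class and method names are invented for the example.

```java
// Hypothetical illustration of keyed partitioning; not the Kafka implementation.
public class KeyedPartitionSketch {

    // Maps a numeric key to a partition in [0, partitionCount).
    // Long.hashCode is a stand-in for Kafka's murmur2 hash of the key bytes.
    public static int partitionFor(long key, int partitionCount) {
        int hash = Long.hashCode(key);
        // Mask off the sign bit so the modulo result is never negative.
        return (hash & 0x7fffffff) % partitionCount;
    }

    public static void main(String[] args) {
        int partitionCount = 6; // assumed value for the demo

        // Use the current time as the key, as discussed above.
        long key = System.currentTimeMillis();
        System.out.println("key " + key + " -> partition "
                + partitionFor(key, partitionCount));

        // The same key always maps to the same partition.
        System.out.println(partitionFor(key, partitionCount)
                == partitionFor(key, partitionCount));
    }
}
```

One consequence worth noting: records produced within the same millisecond get the same key and therefore land on the same partition, so the spread is only as fine-grained as the clock.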

> is there any disadvantage to true random distribution of traffic for a topic?


Yes: you lose ordering. This may or may not matter for your application. It ended up being
a major problem for my application, so I switched to an entirely different topic/partition
scheme in order to achieve my particular goals (isolating each customer’s data + evenly
parallelizing I/O limited processing while retaining a certain required ordering).

HTH!

————
Software Architect @ Park Assist » http://tech.parkassist.com/