kafka-users mailing list archives

From Benjamin Black...@b3k.us>
Subject Re: Trade-off between topics and partitions?
Date Fri, 06 Dec 2013 05:32:24 GMT
Deja vu!

IMO, what you are describing is a database problem, even though you are
talking/thinking about it as a queue problem. I'm sure you could construct
something using Kafka (and Samza), but I think you'd have an easier time
with a database. The number of pending messages per user and the average
message size would be critical in selecting exactly which sort of database
to use.
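
To make the database suggestion concrete, here is a minimal sketch of a per-user inbox. It assumes SQLite (via Python's standard sqlite3 module) purely for illustration; the table name `inbox` and the helpers `publish`, `fetch_pending`, and `ack` are hypothetical and do not come from this thread. Which database actually fits would depend on the pending-message count and message size mentioned above.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for whatever store is chosen.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE inbox (
        id      INTEGER PRIMARY KEY AUTOINCREMENT,  -- monotonically increasing, never reused
        user_id INTEGER NOT NULL,
        body    TEXT    NOT NULL
    )
    """
)
# Index so that reading one user's pending messages does not scan the whole table.
conn.execute("CREATE INDEX inbox_by_user ON inbox (user_id, id)")


def publish(user_id, body):
    """Store a message for a user, whether or not the user is connected."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO inbox (user_id, body) VALUES (?, ?)", (user_id, body)
        )
    return cur.lastrowid


def fetch_pending(user_id):
    """Return the user's undelivered messages in publish order."""
    rows = conn.execute(
        "SELECT id, body FROM inbox WHERE user_id = ? ORDER BY id", (user_id,)
    )
    return rows.fetchall()


def ack(user_id, up_to_id):
    """Delete messages the user has confirmed receiving."""
    with conn:
        conn.execute(
            "DELETE FROM inbox WHERE user_id = ? AND id <= ?", (user_id, up_to_id)
        )
```

A reconnecting user would call `fetch_pending`, process the rows in order, and then `ack` the highest id it saw. Ordering per user comes from the `id` column, so no per-user queue or partition is needed.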

My $0.02.

On Thu, Dec 5, 2013 at 7:47 PM, mission mission <mission0638@gmail.com> wrote:

> Hello,
> According to the Kafka FAQ "How do I choose the number of partitions for a
> topic", clusters with more than 10K partitions are not tested. I am looking
> for advice on how to scale the number of partitions beyond that. My use
> case is to publish messages to 1 million users, each with a unique user
> id. Users are not always connected but a user must receive published
> messages in order.
> What is the best way to divide topics and partitions for this use case? Do
> I need 1 million partitions? The FAQ seems to think so, i.e. "if we were
> storing notifications for users we would encourage a design with a single
> notifications topic partitioned by user id". But the FAQ implies strongly
> that 1 million partitions may wreak havoc on ZooKeeper because they will
> lead to X million znodes that have to be stored in memory. Any suggestions?
> Thanks,
> mission
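
On the quoted question of whether "partitioned by user id" requires one partition per user: it does not have to. A keyed design hashes each user id onto a fixed, much smaller set of partitions, and since a given id always lands in the same partition, that user's messages stay in order. The sketch below simulates this in plain Python. The partition count of 64, the CRC32 hash, and the helper names are assumptions for illustration only; Kafka's own partitioner is not reproduced here.

```python
import zlib

NUM_PARTITIONS = 64  # assumed value, chosen only to be far below 10K


def partition_for(user_id, num_partitions=NUM_PARTITIONS):
    """Map a user id to a partition; the same id always maps to the same one."""
    # CRC32 is a stand-in hash, not the hash Kafka uses.
    return zlib.crc32(user_id.encode("utf-8")) % num_partitions


# Simulate one topic as a list of append-only logs, one per partition.
partitions = [[] for _ in range(NUM_PARTITIONS)]


def send(user_id, message):
    """Append a message to the partition that owns this user id."""
    p = partition_for(user_id)
    partitions[p].append((user_id, message))
    return p


def messages_for(user_id):
    """Read one user's messages back, in the order they were appended."""
    p = partition_for(user_id)
    return [msg for uid, msg in partitions[p] if uid == user_id]
```

Many users share each partition in this scheme, so a consumer reading for one user has to skip over other users' messages, which is part of why a database may be the easier fit for the use case described.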
