kafka-users mailing list archives

From Andrey Falko <afa...@salesforce.com>
Subject Kafka Replicated Partition Limits
Date Wed, 03 Jan 2018 21:48:25 GMT
Hi everyone,

We are seeing more and more push from our Kafka users to support well
over 10k replicated partitions. Ideally, we'd like to avoid running multiple
clusters so that cluster management and monitoring stay simple. We started
testing Kafka to see how many replicated partitions it can handle.

We found that, to maintain SLAs of under 50ms for produce latency,
Kafka starts going downhill at around 9k topics with 5 brokers. Each topic is
replicated 3x in our test. The bottleneck appears to be ZooKeeper: after a
certain period of time, the number of outstanding requests in ZK starts
climbing at a linear rate. Slowing down the rate at which we create and
produce to topics improves things, but doing that makes the system tougher to
manage and use. We are happy to publish our detailed results with
reproduction steps if anyone is interested.
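
For anyone who wants to watch the same symptom, here is a minimal sketch of
sampling ZooKeeper's outstanding-request count. It assumes the "mntr"
four-letter command is enabled on the server and that its response contains a
tab-separated "zk_outstanding_requests" line. The function names, host, and
port are illustrative only and are not taken from our test harness.

```python
import socket


def parse_mntr(text):
    """Parse the tab-separated key/value lines of a `mntr` response."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition("\t")
        if sep:  # skip lines that are not key<TAB>value
            stats[key.strip()] = value.strip()
    return stats


def fetch_mntr(host, port=2181, timeout=5.0):
    """Send `mntr` to a ZooKeeper server and return the raw text response."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"mntr")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def outstanding_requests(host, port=2181):
    """Return the current outstanding-request count as an int."""
    return int(parse_mntr(fetch_mntr(host, port))["zk_outstanding_requests"])
```

Calling outstanding_requests("localhost") at a fixed interval while topics are
being created would show whether the count stays flat or grows over time.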

Has anyone overcome this problem and scaled beyond 9k replicated partitions?
Does anyone have ZooKeeper tuning suggestions? Is ZooKeeper even the bottleneck?

According to this post, we should have at most 300 3x-replicated partitions
per broker:
https://www.confluent.io/blog/how-to-choose-the-number-of-topicspartitions-in-a-kafka-cluster/
Is anyone doing work to have Kafka support more than that?
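
For scale, here is a back-of-the-envelope comparison of the test load against
the 300-per-broker figure. It assumes one partition per topic and an even
spread of replicas across brokers. Both are assumptions for illustration, not
stated test parameters, and the function name is hypothetical.

```python
def replicas_per_broker(topics, partitions_per_topic, replication_factor, brokers):
    """Average number of partition replicas hosted by each broker."""
    total_replicas = topics * partitions_per_topic * replication_factor
    return total_replicas / brokers


# Figures from the test: ~9k topics, 3x replication, 5 brokers.
# partitions_per_topic=1 is an assumption.
load = replicas_per_broker(9000, 1, 3, 5)
print(load)  # 5400.0
```

Under those assumptions each broker carries about 5,400 replicas.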

Best regards,
Andrey Falko
Salesforce.com
