kafka-users mailing list archives

From Bhavesh Mistry <mistry.p.bhav...@gmail.com>
Subject Topic Partitioning Strategy For Large Data
Date Fri, 23 May 2014 19:49:39 GMT
Hi Kafka Users,

We are trying to transport 4 TB of data per day on a single topic.  It is
operational application log data.  How do we estimate the number of partitions
and choose a partitioning strategy?  Our goal is to drain the Kafka brokers
(from the consumer side) as soon as messages arrive (keeping consumer lag to a
minimum), and we would also like to distribute the logs uniformly across all
partitions.

Here is our broker HW spec:

3-broker cluster (192 GB RAM, 32 cores each, with SSDs sized to hold 7 days of
data) with 100G NICs

Data rate:  ~13 GB per minute
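For what it's worth, a commonly cited rule of thumb (not specific to this setup) is to size the partition count by throughput: take the target throughput divided by the achievable per-partition producer throughput, the same divided by the per-partition consumer throughput, and use the larger of the two. A minimal sketch, where the per-partition rates (10 MB/s in, 20 MB/s out) are illustrative assumptions that would need to be benchmarked on the actual hardware:

```java
public class PartitionEstimate {

    // Rule of thumb: partitions >= max(target / producerTputPerPartition,
    //                                  target / consumerTputPerPartition).
    // All rates in MB/s.
    static int estimatePartitions(double targetMBps,
                                  double producerMBpsPerPartition,
                                  double consumerMBpsPerPartition) {
        double byProducer = targetMBps / producerMBpsPerPartition;
        double byConsumer = targetMBps / consumerMBpsPerPartition;
        return (int) Math.ceil(Math.max(byProducer, byConsumer));
    }

    public static void main(String[] args) {
        // ~13 GB/min from the post, converted to MB/s.
        double target = 13.0 * 1024 / 60;            // ~221.9 MB/s
        // Assumed per-partition rates; measure these on real brokers.
        int n = estimatePartitions(target, 10.0, 20.0);
        System.out.println(n);                       // prints 23
    }
}
```

In practice people also over-provision partitions somewhat, since increasing the partition count of an existing keyed topic later reshuffles key-to-partition mappings.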

Is there a formula to compute the optimal number of partitions needed?  Also,
how do we ensure uniform distribution from the producer side?  (Currently we
use counter % numPartitions, which is not a viable solution in a production
environment.)
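One guess at why a plain counter % numPartitions breaks down in production: a non-atomic counter shared across producer threads is racy, and a per-process counter can skew if processes send at different rates. A minimal thread-safe sketch of the round-robin idea (standalone, not tied to any Kafka partitioner API):

```java
import java.util.concurrent.atomic.AtomicLong;

public class RoundRobinSketch {
    // AtomicLong avoids the lost-update race a plain int counter
    // would have when many producer threads share it.
    private static final AtomicLong COUNTER = new AtomicLong(0);

    static int nextPartition(int numPartitions) {
        // floorMod keeps the result non-negative even if the
        // counter ever wraps past Long.MAX_VALUE.
        return (int) Math.floorMod(COUNTER.getAndIncrement(), (long) numPartitions);
    }

    public static void main(String[] args) {
        // Spread 8 messages over 6 partitions.
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < 8; i++) {
            out.append(nextPartition(6)).append(' ');
        }
        System.out.println(out.toString().trim()); // prints: 0 1 2 3 4 5 0 1
    }
}
```

The alternative for unkeyed data is simply to rely on whatever random/sticky assignment the producer does when no key is supplied, which also spreads load uniformly over time without any shared state.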

