kafka-users mailing list archives

From Akshat Aranya <aara...@gmail.com>
Subject Re: Kafka - deployment size and topologies
Date Wed, 08 Apr 2015 16:29:22 GMT
Thanks for the info, Todd.  This is very useful.  Please see my question
inline:

On Mon, Apr 6, 2015 at 10:24 AM, Todd Palino <tpalino@gmail.com> wrote:

>
>     - Partition count (leader and follower combined) on each broker should
> stay under 4000
>
> As far as topic volume goes, it varies widely. We have topics that only see
> a single message per minute (or less). Our largest topic by bytes has a
> peak rate of about 290 Mbits/sec. Our largest topic by messages has a peak
> rate of about 225k messages/sec. Note that those are in the same cluster.
> When we are sizing topics (number of partitions), we use the following
> guidelines:
>     - Have at least as many partitions as there are consumers in the
> largest group
>     - Keep partition size on disk under 50GB per partition (better balance)
>     - Take into account any other application requirements (keyed messages,
> specific topic counts required, etc.)
>

What would you say is a recommended configuration when you don't have too
many topics?  It seems like having too many partitions is not recommended,
but at the same time, you need more partitions to be able to utilize all
the disks and handle the data rate, especially for high volume topics.
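
To make the question concrete, here is how I read the sizing guidelines
quoted above, as a rough Python sketch. The 50 GB and 4000 figures come from
Todd's message; the function names, the even-placement assumption, and the
example topic (2 TB retained, 12 consumers, replication factor 3, 6 brokers)
are hypothetical and only there for illustration.

```python
# Thresholds quoted from the guidelines above.
MAX_PARTITION_BYTES = 50 * 10**9   # keep each partition at or under ~50 GB on disk
MAX_PARTITIONS_PER_BROKER = 4000   # leader and follower partitions combined


def min_partitions(largest_consumer_group, retained_bytes):
    """Smallest partition count that satisfies both guidelines.

    retained_bytes is assumed to be the topic's total on-disk size
    for a single replica.
    """
    by_consumers = largest_consumer_group
    # Integer ceiling division, avoids float rounding on large byte counts.
    by_size = -(-retained_bytes // MAX_PARTITION_BYTES)
    return max(1, by_consumers, by_size)


def replicas_per_broker(partitions, replication_factor, brokers):
    """Partition replicas per broker, assuming perfectly even placement."""
    return -(-(partitions * replication_factor) // brokers)


# Hypothetical topic: 2 TB retained, 12 consumers, RF 3, 6 brokers.
partitions = min_partitions(12, 2 * 10**12)
load = replicas_per_broker(partitions, 3, 6)
print(partitions, load, load <= MAX_PARTITIONS_PER_BROKER)
```

This only gives a lower bound on the partition count. It does not account
for the number of disks per broker or the per-partition throughput, which is
the part I am asking about.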

> I hope this helps. I'll be covering some of this at my ApacheCon talk
> (Kafka at Scale: Multi-Tier Architectures) and at the meet up that Jun has
> set up at ApacheCon. If you have any questions, just ask!
>
> -Todd
>
>
> On Mon, Apr 6, 2015 at 9:35 AM, Rama Ramani <rama.ramani@live.com> wrote:
>
> > Hello,
> >           I am trying to understand some of the common Kafka deployment
> > sizes ("small", "medium", "large") and configuration to come up with a set
> > of common templates for deployment on Linux. Some of the Qs to answer are:
> >
> > - Number of nodes in the cluster
> > - Machine Specs (cpu, memory, number of disks, network etc.)
> > - Speeds & Feeds of messages
> > - What are some of the best practices to consider when laying out the
> > clusters?
> > -  Is there a sizing calculator for coming up with this?
> >
> > If you can please share pointers to existing materials or specific details
> > of your deployment, that will be great.
> >
> > Regards
> > Rama
> >
>
