kafka-users mailing list archives

From Lorenzo Alberton <l.alber...@gmail.com>
Subject Re: Thousands of topics
Date Tue, 07 Aug 2012 21:47:34 GMT
Hi Taylor,

thanks for your reply. I'd love to read your blog post about your
experiences with it, especially around hardware configuration and how you
consume the data (few/many short/long-lived processes, average throughput
per topic). The cleanup script seems really useful too; I was considering
writing one that also cleans dead topics off ZooKeeper.
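
A minimal sketch of the filesystem half of such a cleaner, in Python. It
assumes each topic partition lives in its own directory directly under the
broker's log directory and that a "dead" topic is one whose files are all
empty and have not been touched for a while. The function names and the
max_idle_seconds parameter are made up for illustration; this is not
Tagged's script, and it does not touch ZooKeeper.

```python
import os
import shutil
import time


def find_dead_topic_dirs(log_dir, max_idle_seconds, now=None):
    """Return topic directories that hold no data and have been idle.

    Assumption: every subdirectory of log_dir is one topic partition.
    A directory counts as dead when all of its files are zero bytes and
    nothing in it was modified within the last max_idle_seconds.
    """
    now = time.time() if now is None else now
    dead = []
    for name in sorted(os.listdir(log_dir)):
        path = os.path.join(log_dir, name)
        if not os.path.isdir(path):
            continue
        entries = [os.path.join(path, f) for f in os.listdir(path)]
        has_data = any(os.path.getsize(e) > 0 for e in entries)
        newest = max([os.path.getmtime(e) for e in entries]
                     + [os.path.getmtime(path)])
        if not has_data and now - newest >= max_idle_seconds:
            dead.append(path)
    return dead


def remove_dead_topic_dirs(log_dir, max_idle_seconds, dry_run=True):
    """Delete dead topic directories; with dry_run=True only report them."""
    dead = find_dead_topic_dirs(log_dir, max_idle_seconds)
    if not dry_run:
        for path in dead:
            shutil.rmtree(path)
    return dead
```

Running it with dry_run=True first shows what would be deleted. Whether it
is safe to delete directories while the broker is running is something I
have not verified.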

Thanks!

Lorenzo


On Tue, Jul 31, 2012 at 8:58 PM, Taylor Gautier <tgautier@tagged.com> wrote:

> Yes, we have done so at Tagged.  I chronicled a bit of our experience here
> on the mailing list.  Effectively we found that a single machine could
> not go above ~20k total topics.  This could be OS-dependent, however (we use
> CentOS 5.x).
>
> Various tweaks we made to go further:
>
>    1. a beefed-up node.js kafka client/producer implementation -
>    https://github.com/tagged/node-kafka lies at the heart of our kafka
>    deployment
>    2. our own kafka software load balancer (implemented using said library)
>    that shards out independent Kafka instances (guarantees in-order delivery
>    per topic and scales the # of kafka topics linearly as a function of the #
>    of kafka machines)
>    3. a continuous cleaner that removes old dead topics completely from the
>    filesystem (0.7 cleaner leaves empty directory/file which eats up open file
>    handles and limits max # of topics)
>    4. (coming soon) a hierarchical topic directory structure to ease the
>    pain of too many directories/files in a single directory (should help the
>    ~20k number, though probably by less than you might imagine)
>
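
To check my understanding of item 2, here is a rough Python sketch of
sharding topics across independent Kafka instances. It assumes the balancer
simply hashes the topic name to pick one instance; the function name and the
broker addresses are hypothetical, and the real balancer (built on
node-kafka) may well work differently.

```python
import hashlib


def broker_for_topic(topic, brokers):
    """Map a topic name to exactly one broker from a fixed list.

    All messages for a topic go to the same broker, so per-topic ordering
    is left to that single instance, while the total number of topics
    spreads roughly evenly over the list.
    """
    if not brokers:
        raise ValueError("at least one broker is required")
    digest = hashlib.md5(topic.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(brokers)
    return brokers[index]


if __name__ == "__main__":
    # Hypothetical broker addresses, for illustration only.
    brokers = ["kafka1:9092", "kafka2:9092", "kafka3:9092"]
    for topic in ("user-42", "user-43", "user-44"):
        print(topic, "->", broker_for_topic(topic, brokers))
```

With plain modulo hashing, changing the number of brokers moves most topics
to a different instance, so a real deployment would need some plan for that
(consistent hashing, or a lookup table).
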
> On our todo list is blogging about this in more detail, and contributing
> back more than just the node.js implementation.
>
> On Mon, Jul 30, 2012 at 8:39 AM, Lorenzo Alberton <l.alberton@gmail.com>
> wrote:
>
> > Is there anyone who tried Kafka with thousands of concurrent topics?
> > If so, what are your experiences? How did you tune it?
> >
> > Thanks!
> >
>
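
On the hierarchical directory idea (item 4 in the quoted list), a small
Python sketch of one way it could look: topic directories are bucketed under
a couple of levels derived from a hash of the topic name, so no single
directory holds every topic. The function name, the levels/width parameters
and the "<topic>-<partition>" leaf naming are assumptions for illustration,
not what Kafka or Tagged actually does.

```python
import hashlib
import os


def hierarchical_topic_dir(log_dir, topic, partition, levels=2, width=2):
    """Build a nested path such as <log_dir>/ab/cd/<topic>-<partition>.

    Each level is `width` hex characters taken from the MD5 of the topic
    name, so width=2 gives up to 256 subdirectories per level.
    """
    digest = hashlib.md5(topic.encode("utf-8")).hexdigest()
    buckets = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return os.path.join(log_dir, *buckets, "%s-%d" % (topic, partition))


if __name__ == "__main__":
    print(hierarchical_topic_dir("/data/kafka", "user-42", 0))
```

With two levels of 256 buckets, 20k topics would average well under one
directory per leaf bucket. This only reduces the number of entries per
directory; it does nothing for the open file handle count.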
