kafka-users mailing list archives

From Chen Wang <chen.apache.s...@gmail.com>
Subject Re: Issue with 240 topics per day
Date Tue, 12 Aug 2014 04:39:46 GMT
Got it. Thanks for the input, Todd!
Chen


On Mon, Aug 11, 2014 at 9:31 PM, Todd Palino <tpalino@linkedin.com.invalid>
wrote:

> As I noted, we have a cluster right now with 70k partitions. It’s running
> on over 30 brokers, partly to cover the number of partitions and
> partly to cover the amount of data that we push through it. If you can
> have at least 4 or 5 brokers, I wouldn’t anticipate any problems with the
> number of partitions. You may need more than that depending on the
> throughput you want to handle.
>
> -Todd
>
> On 8/11/14, 9:20 PM, "Chen Wang" <chen.apache.solr@gmail.com> wrote:
>
> >Todd,
> >Yes, I actually thought about that. My concern is that even a week's worth of
> >topic partitions (240*7*3 = 5040) is too many. Does LinkedIn have good
> >experience with this many topics in your system? :-)
> >Thanks,
> >Chen
> >
> >
> >On Mon, Aug 11, 2014 at 9:02 PM, Todd Palino
> ><tpalino@linkedin.com.invalid>
> >wrote:
> >
> >> In order to delete topics, you need to shut down the entire cluster (all
> >> brokers), delete the topics from Zookeeper, and delete the log files and
> >> partition directory from the disk on the brokers. Then you can restart the
> >> cluster. Assuming that you can take a periodic outage on your cluster, you
> >> can do it this way.
> >>
> >> Reading what you’re intending to do in other parts of this thread, have
> >> you considered setting up 1 week’s worth of topics with 3 day retention,
> >> and having your producer and consumer rotate between them? That is, on
> >> Sunday at 12:00 AM, you start with topic1, then proceed to topic2 at
> >> 12:06, and so on. The next week, you loop around over exactly the same
> >> topics, knowing that the retention settings have cleared out the old data.
> >>
> >> -Todd
> >>
> >> On 8/11/14, 4:45 PM, "Chen Wang" <chen.apache.solr@gmail.com> wrote:
> >>
> >> >Todd,
> >> >I actually only intend to keep each topic valid for 3 days at most. Each of
> >> >our topics has 3 partitions, so it's around 3*240*3 = 2160 partitions. Since
> >> >there is no API for deleting topics, I guess I could set up a cron job
> >> >deleting the outdated topics (folders) from Zookeeper.
> >> >Do you know when the delete topic API will be available in Kafka?
> >> >Chen
> >> >
> >> >
> >> >On Mon, Aug 11, 2014 at 3:47 PM, Todd Palino
> >> ><tpalino@linkedin.com.invalid>
> >> >wrote:
> >> >
> >> >> You need to consider your total partition count as you do this. After 30
> >> >> days, assuming 1 partition per topic, you have 7200 partitions. Depending
> >> >> on how many brokers you have, this can start to be a problem. We just
> >> >> found an issue on one of our clusters that has over 70k partitions that
> >> >> there’s now a problem with doing actions like a preferred replica election
> >> >> for all topics because the JSON object that gets written to the zookeeper
> >> >> node to trigger it is too large for Zookeeper’s default 1 MB data size.
> >> >>
> >> >> You also need to think about the number of open file handles. Even with no
> >> >> data, there will be open files for each topic.
> >> >>
> >> >> -Todd
> >> >>
> >> >>
> >> >> On 8/11/14, 2:19 PM, "Chen Wang" <chen.apache.solr@gmail.com> wrote:
> >> >>
> >> >> >Folks,
> >> >> >Is there any potential issue with creating 240 topics every day? Although
> >> >> >the retention of each topic is set to be 2 days, I am a little concerned
> >> >> >that since right now there is no delete topic api, the zookeepers might be
> >> >> >overloaded.
> >> >> >Thanks,
> >> >> >Chen
> >> >>
> >> >>
> >>
> >>
>
>
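
The partition counts quoted in this thread (7200, 2160 and 5040) all come from the same product: topics per day, times days of topics kept, times partitions per topic. A minimal sketch of that arithmetic follows; the function and constant names are illustrative and not part of any Kafka API.

```python
# Hypothetical helper: reproduces the partition counts mentioned in the thread.
TOPICS_PER_DAY = 240  # figure given in the original question


def total_partitions(days, partitions_per_topic, topics_per_day=TOPICS_PER_DAY):
    """Number of partitions that exist when `days` worth of topics are kept."""
    return topics_per_day * days * partitions_per_topic


if __name__ == "__main__":
    print(total_partitions(30, 1))  # 30 days, 1 partition per topic -> 7200
    print(total_partitions(3, 3))   # 3 days, 3 partitions per topic -> 2160
    print(total_partitions(7, 3))   # 1 week, 3 partitions per topic -> 5040
```

Replication is not included here; if the question is how many partition replicas the brokers host in total, multiply by the replication factor as well.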

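The weekly rotation Todd suggests can be sketched as a pure function from a timestamp to a topic name. With 240 topics per day, each topic covers a 6 minute slot, and one week needs 1680 topics. The `topic` prefix, the 1-based numbering, and the use of naive local timestamps are assumptions made for illustration; the thread only specifies that the week starts with topic1 on Sunday at 12:00 AM and moves to topic2 at 12:06.

```python
from datetime import datetime

TOPICS_PER_DAY = 240
SLOT_MINUTES = 24 * 60 // TOPICS_PER_DAY  # 6 minutes per topic
TOPICS_PER_WEEK = TOPICS_PER_DAY * 7      # 1680 topics, reused every week


def topic_for(ts, prefix="topic"):
    """Return the name of the topic that covers timestamp `ts`.

    Assumes `ts` is already in the time zone that producers and
    consumers have agreed on (hypothetical convention).
    """
    # datetime.weekday() uses Monday=0; shift so that Sunday=0.
    day_of_week = (ts.weekday() + 1) % 7
    minutes_into_week = day_of_week * 24 * 60 + ts.hour * 60 + ts.minute
    slot = minutes_into_week // SLOT_MINUTES  # 0 .. TOPICS_PER_WEEK - 1
    return f"{prefix}{slot + 1}"              # 1-based, as in "topic1"


if __name__ == "__main__":
    print(topic_for(datetime(2014, 8, 10, 0, 0)))    # Sunday 12:00 AM -> topic1
    print(topic_for(datetime(2014, 8, 10, 0, 6)))    # Sunday 12:06 AM -> topic2
    print(topic_for(datetime(2014, 8, 16, 23, 54)))  # last slot of the week -> topic1680
    print(topic_for(datetime(2014, 8, 17, 0, 0)))    # next Sunday -> topic1 again
```

Because the producer and the consumer both derive the name from the clock, they need no shared state beyond the agreed time zone. The scheme relies on retention (3 days in Todd's suggestion) being shorter than the 7 day cycle, so a topic is empty by the time it is reused.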