kafka-users mailing list archives

From Philip O'Toole <philip.oto...@yahoo.com.INVALID>
Subject Re: Issue with 240 topics per day
Date Tue, 12 Aug 2014 15:28:59 GMT
Todd -- can you share details of the ZK cluster you are running to support this scale? Is
it a single Kafka cluster? Are you using a single ZK cluster?


Thanks,

Philip

 
-----------------------------------------
http://www.philipotoole.com 


On Monday, August 11, 2014 9:32 PM, Todd Palino <tpalino@linkedin.com.INVALID> wrote:
 


As I noted, we have a cluster right now with 70k partitions. It’s running
on over 30 brokers, partly to cover the number of partitions and
partly to cover the amount of data that we push through it. If you can
have at least 4 or 5 brokers, I wouldn’t anticipate any problems with the
number of partitions. You may need more than that depending on the
throughput you want to handle.

-Todd


On 8/11/14, 9:20 PM, "Chen Wang" <chen.apache.solr@gmail.com> wrote:

>Todd,
>Yes, I actually thought about that. My concern is that even a week's worth
>of topic partitions (240*7*3 = 5040) is too many. Does LinkedIn have good
>experience using this many topics in your system? :-)
>Thanks,
>Chen
>
>
>On Mon, Aug 11, 2014 at 9:02 PM, Todd Palino
><tpalino@linkedin.com.invalid>
>wrote:
>
>> In order to delete topics, you need to shut down the entire cluster (all
>> brokers), delete the topics from Zookeeper, and delete the log files and
>> partition directory from the disk on the brokers. Then you can restart the
>> cluster. Assuming that you can take a periodic outage on your cluster, you
>> can do it this way.
>>
>> Reading what you’re intending to do in other parts of this thread, have
>> you considered setting up 1 week’s worth of topics with 3 day retention,
>> and having your producer and consumer rotate between them? That is, on
>> Sunday at 12:00 AM, you start with topic1, then proceed to topic2 at
>> 12:06, and so on. The next week, you loop around over exactly the same
>> topics, knowing that the retention settings have cleared out the old
>> data.
>>
>> -Todd
>>
>> On 8/11/14, 4:45 PM, "Chen Wang" <chen.apache.solr@gmail.com> wrote:
>>
>> >Todd,
>> >I actually only intend to keep each topic valid for 3 days at most. Each of
>> >our topics has 3 partitions, so it's around 3*240*3 = 2160 partitions. Since
>> >there is no API for deleting topics, I guess I could set up a cron job
>> >deleting the outdated topics (folders) from Zookeeper.
>> >Do you know when the delete topic API will be available in Kafka?
>> >Chen
>> >
>> >
>> >On Mon, Aug 11, 2014 at 3:47 PM, Todd Palino
>> ><tpalino@linkedin.com.invalid>
>> >wrote:
>> >
>> >> You need to consider your total partition count as you do this. After 30
>> >> days, assuming 1 partition per topic, you have 7200 partitions. Depending
>> >> on how many brokers you have, this can start to be a problem. We just
>> >> found an issue on one of our clusters that has over 70k partitions:
>> >> there's now a problem with doing actions like a preferred replica election
>> >> for all topics, because the JSON object that gets written to the Zookeeper
>> >> node to trigger it is too large for Zookeeper's default 1 MB data size.
>> >>
>> >> You also need to think about the number of open file handles. Even with
>> >> no data, there will be open files for each topic.
>> >>
>> >> -Todd
>> >>
>> >>
>> >> On 8/11/14, 2:19 PM, "Chen Wang" <chen.apache.solr@gmail.com> wrote:
>> >>
>> >> >Folks,
>> >> >Is there any potential issue with creating 240 topics every day? Although
>> >> >the retention of each topic is set to be 2 days, I am a little concerned
>> >> >that since right now there is no delete topic api, the zookeepers might be
>> >> >overloaded.
>> >> >Thanks,
>> >> >Chen
>> >>
>> >>
>>
>>
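
The weekly rotation Todd suggests in this thread can be sketched as a small Python helper. This is only an illustration under stated assumptions: the topic names (topic1 through topic1680), the function names, and the use of naive local timestamps are all hypothetical and not taken from any Kafka API. With 240 topics per day, each topic covers 6 minutes, so one week needs 240*7 = 1680 topics, or 5040 partitions at 3 partitions per topic, which matches Chen's figure.

```python
from datetime import datetime

# Figures taken from the thread; everything else here is an assumption.
TOPICS_PER_DAY = 240
PARTITIONS_PER_TOPIC = 3
MINUTES_PER_TOPIC = 24 * 60 // TOPICS_PER_DAY  # 6 minutes per topic
TOPICS_PER_WEEK = TOPICS_PER_DAY * 7           # 1680 topics, reused every week


def topic_for(ts: datetime) -> str:
    """Return the (hypothetical) topic name for the 6-minute slot containing ts.

    The week is assumed to start on Sunday at 12:00 AM, as in Todd's example.
    """
    # datetime.weekday() is Monday=0 .. Sunday=6; shift so that Sunday=0.
    days_since_sunday = (ts.weekday() + 1) % 7
    minutes_into_week = days_since_sunday * 24 * 60 + ts.hour * 60 + ts.minute
    slot = minutes_into_week // MINUTES_PER_TOPIC
    return f"topic{slot + 1}"


def total_partitions() -> int:
    """Partitions needed for one week's worth of topics."""
    return TOPICS_PER_WEEK * PARTITIONS_PER_TOPIC


if __name__ == "__main__":
    print(topic_for(datetime(2014, 8, 10, 0, 0)))  # a Sunday at 12:00 AM
    print(topic_for(datetime(2014, 8, 10, 0, 6)))  # six minutes later
    print(total_partitions())
```

Producers and consumers would both call the same function with the current time, so they move to the next topic together. The sketch relies on the retention settings having cleared the old data before a topic is reused a week later, as the thread assumes.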