kafka-users mailing list archives

From Jun Rao <jun...@gmail.com>
Subject Re: Kafka crashed after multiple topics were added
Date Wed, 14 Aug 2013 14:38:32 GMT
The first error is caused by too many open file handles. Kafka keeps each
of the segment files open on the broker, so the more topics/partitions you
have, the more file handles you need. You probably need to increase the
open file handle limit and also monitor the number of open file handles so
that you can get an alert when it gets close to the limit.
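
As a rough illustration of the sizing and monitoring above, here is a minimal
sketch. It assumes a Linux broker where /proc is available; the function names
are made up for this example and are not part of Kafka. With 3 replicas on 3
brokers, every broker holds a replica of every partition, and each replica has
at least one segment, so the first function gives only a lower bound.

```shell
# Rough lower bound on segment files one broker keeps open:
# topics * partitions * replicas / brokers, assuming one segment per partition.
estimate_open_files() {
  topics=$1; partitions=$2; replicas=$3; brokers=$4
  echo $(( topics * partitions * replicas / brokers ))
}

# Succeeds (exit status 0) when open handles reach pct percent of the limit.
fd_near_limit() {
  open=$1; limit=$2; pct=$3
  [ $(( open * 100 )) -ge $(( limit * pct )) ]
}

# Prints "<open handles> <soft limit>" for a PID by reading /proc (Linux only).
fd_usage() {
  pid=$1
  open=$(ls "/proc/$pid/fd" | wc -l)
  limit=$(awk '/Max open files/ {print $4}' "/proc/$pid/limits")
  echo "$open $limit"
}
```

For the script quoted below, `estimate_open_files 150 36 3 3` prints 5400,
which is already well above a 1024 limit if that is what the broker runs with
(check with `ulimit -n`). A cron job could call `fd_usage` with the broker PID
and raise an alert when `fd_near_limit` succeeds at, say, 80 percent.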

Not sure why you get the second error on restart. Are you using the 0.8
beta1 release?

Thanks,

Jun


On Tue, Aug 13, 2013 at 11:04 PM, Vadim Keylis <vkeylis2009@gmail.com> wrote:

> We have a 3-node Kafka cluster. I initially created 4 topics.
> I then wrote a small shell script to create 150 topics.
>
> TOPICS=$(< $1)
> for topic in $TOPICS
> do
>    echo "/usr/local/kafka/bin/kafka-create-topic.sh --replica 3 --topic $topic --zookeeper $2:2181/kafka --partition 36"
>    /usr/local/kafka/bin/kafka-create-topic.sh --replica 3 --topic $topic --zookeeper $2:2181/kafka --partition 36
> done
>
> 10 minutes later I see messages like this
> [2013-08-13 11:43:58,944] INFO [ReplicaFetcherManager on broker 7] Removing
> fetcher for partition [m3_registration,0]
> (kafka.server.ReplicaFetcherManager) followed by
> [2013-08-13 11:44:00,067] WARN [ReplicaFetcherThread-0-8], error for
> partition [m3_registration,22] to broker 8
> (kafka.server.ReplicaFetcherThread)
> kafka.common.NotLeaderForPartitionException
>
> A few minutes later these were followed by messages like the following,
> which overwhelmed the logging system.
> [2013-08-13 11:46:35,916] ERROR error in loggedRunnable
> (kafka.utils.Utils$)
> java.io.FileNotFoundException:
> /home/kafka/data7/replication-offset-checkpoint.tmp (Too many open files)
>         at java.io.FileOutputStream.open(Native Method)
>         at java.io.FileOutputStream.<init>(FileOutputStream.java:194)
>
> I restarted the service after discovering the problem. After a few minutes
> of attempting to recover, the Kafka service crashed with the following error.
>
>  [2013-08-13 17:20:08,953] INFO [Log Manager on Broker 7] Loading log
> 'm3_registration-29' (kafka.log.LogManager)
> [2013-08-13 17:20:08,992] FATAL Fatal error during KafkaServerStable
> startup. Prepare to shutdown (kafka.server.KafkaServerStartable)
> java.lang.IllegalStateException: Found log file with no corresponding index
> file.
>
> There was no activity on the cluster after the topics were added.
> What could have caused the crash and triggered the "too many open files"
> exception? What is the best way to recover so that the Kafka service can be
> restarted? (I am not sure the delete topic command will work in this case,
> as none of the 3 services will start.) How can this be prevented in the future?
>
> Thanks so much in advance,
> Vadim
>
