kafka-users mailing list archives

From Drew Daugherty <Drew.Daughe...@returnpath.com>
Subject Kafka Consumer Threads Stalled
Date Tue, 13 Aug 2013 23:57:15 GMT
Hi,

We are using ZooKeeper 3.3.6 with Kafka 0.7.2. We have a topic with 8 partitions on each of
3 brokers, which we are consuming with a multi-threaded consumer group. We are using the
following settings for our consumers:
zk.connectiontimeout.ms=12000000
fetch_size=52428800
queuedchunks.max=6
consumer.timeout.ms=5000
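
For reference, these properties are fed straight into the high-level consumer. A rough sketch
of the wiring, assuming the standard 0.7 Java consumer API (the zk.connect string, group id,
topic, and thread count below are placeholders, not our actual values):

import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;
import kafka.message.Message;

public class ConsumerSetup {
    // Build the connector from the properties listed above.
    public static ConsumerConnector connect() {
        Properties props = new Properties();
        props.put("zk.connect", "zk1:2181,zk2:2181,zk3:2181");  // placeholder
        props.put("groupid", "our-consumer-group");             // placeholder
        props.put("zk.connectiontimeout.ms", "12000000");
        props.put("fetch.size", "52428800");  // listed above as fetch_size; the 0.7 ConsumerConfig key is fetch.size, as far as I know
        props.put("queuedchunks.max", "6");
        props.put("consumer.timeout.ms", "5000");
        return Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
    }

    // One KafkaStream per consumer thread for the topic.
    public static List<KafkaStream<Message>> streams(ConsumerConnector connector,
                                                     String topic, int threads) {
        Map<String, List<KafkaStream<Message>>> byTopic =
            connector.createMessageStreams(Collections.singletonMap(topic, threads));
        return byTopic.get(topic);
    }
}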

Our brokers have the following configuration:
socket.send.buffer=1048576
socket.receive.buffer=1048576
max.socket.request.bytes=104857600
log.flush.interval=10000
log.default.flush.interval.ms=1000
log.default.flush.scheduler.interval.ms=1000
log.retention.hours=4
log.file.size=536870912
enable.zookeeper=true
zk.connectiontimeout.ms=6000
zk.sessiontimeout.ms=6000
max.message.size=52428800

We are noticing that after the consumer runs for a short while, some threads stop consuming
and start repeatedly throwing the following timeout exception:
kafka.consumer.ConsumerTimeoutException
        at kafka.consumer.ConsumerIterator.makeNext(ConsumerIterator.scala:66)
        at kafka.consumer.ConsumerIterator.makeNext(ConsumerIterator.scala:32)
        at kafka.utils.IteratorTemplate.maybeComputeNext(IteratorTemplate.scala:59)
        at kafka.utils.IteratorTemplate.hasNext(IteratorTemplate.scala:51)

When this happens, message consumption on the affected partitions doesn't recover; it stalls
and the consumer offset stays frozen.  The exceptions also keep appearing in the logs, because
our thread logic logs the error, then tries to create another iterator from the stream and
consume from it (a sketch of that loop is below).  We also notice that consumption tends to
freeze for partitions on two of the three brokers, but one broker always seems to keep its
consumers fed.  Are there settings or logic we can use to avoid or recover from these exceptions?
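
For context, each consumer thread runs a loop roughly like the sketch below (simplified, with
placeholder logging and processing; I'm going from memory on the exact 0.7 iterator types):

import kafka.consumer.ConsumerIterator;
import kafka.consumer.ConsumerTimeoutException;
import kafka.consumer.KafkaStream;
import kafka.message.Message;
import kafka.message.MessageAndMetadata;

public class ConsumerTask implements Runnable {
    private final KafkaStream<Message> stream;

    public ConsumerTask(KafkaStream<Message> stream) {
        this.stream = stream;
    }

    public void run() {
        while (true) {
            // (Re)create an iterator over the stream and drain it.
            ConsumerIterator<Message> it = stream.iterator();
            try {
                while (it.hasNext()) {
                    MessageAndMetadata<Message> mam = it.next();
                    process(mam.message());  // placeholder for our real handling
                }
            } catch (ConsumerTimeoutException e) {
                // consumer.timeout.ms=5000 expired with no data; log and go around again.
                System.err.println("Consumer timed out, retrying: " + e);
            }
        }
    }

    private void process(Message message) {
        // placeholder
    }
}

That outer while(true) is what keeps re-creating the iterator after each ConsumerTimeoutException,
and it is where the threads end up spinning once a partition stalls.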

-drew
