kafka-users mailing list archives

From Jun Rao <jun...@gmail.com>
Subject Re: Kafka Consumer Threads Stalled
Date Wed, 14 Aug 2013 14:42:09 GMT
In that case, have you looked at
https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Myconsumerseemstohavestopped%2Cwhy%3F?

Thanks,

Jun


On Wed, Aug 14, 2013 at 7:38 AM, Drew Daugherty <
Drew.Daugherty@returnpath.com> wrote:

> The problem is not that the timeout exceptions are being thrown. We have
> tried with and without the timeout setting and, in both cases, we end up
> with threads that are stalled and not consuming data. Thus the problem is
> consumers that are registered but not consuming, with no rebalancing
> being done. We suspected a problem with ZooKeeper, but we have run smoke
> and latency tests and got reasonable results.
>
> -drew
>
>
> ----- Original Message -----
> From: Jun Rao <junrao@gmail.com>
> To: "users@kafka.apache.org" <users@kafka.apache.org>
> Sent: 8/13/2013 10:17 PM
> Subject: Re: Kafka Consumer Threads Stalled
>
>
>
> If you don't want to see ConsumerTimeoutException, just set
> consumer.timeout.ms to -1. If you do need consumer.timeout.ms larger than
> 0, make sure that on ConsumerTimeoutException, your consumer thread loops
> back and calls hasNext() on the iterator to resume the consumption.
>
> Thanks,
>
> Jun
>
>
> On Tue, Aug 13, 2013 at 4:57 PM, Drew Daugherty <
> Drew.Daugherty@returnpath.com> wrote:
>
> > Hi,
> >
> > We are using zookeeper 3.3.6 with kafka 0.7.2. We have a topic with 8
> > partitions on each of 3 brokers that we are consuming with a consumer
> > group with multiple threads. We are using the following settings for our
> > consumers:
> > zk.connectiontimeout.ms=12000000
> > fetch_size=52428800
> > queuedchunks.max=6
> > consumer.timeout.ms=5000
> >
> > Our brokers have the following configuration:
> > socket.send.buffer=1048576
> > socket.receive.buffer=1048576
> > max.socket.request.bytes=104857600
> > log.flush.interval=10000
> > log.default.flush.interval.ms=1000
> > log.default.flush.scheduler.interval.ms=1000
> > log.retention.hours=4
> > log.file.size=536870912
> > enable.zookeeper=true
> > zk.connectiontimeout.ms=6000
> > zk.sessiontimeout.ms=6000
> > max.message.size=52428800
> >
> > We are noticing that after the consumer runs for a short while, some
> > threads stop consuming and start throwing the following timeout
> > exceptions:
> > kafka.consumer.ConsumerTimeoutException
> >         at kafka.consumer.ConsumerIterator.makeNext(ConsumerIterator.scala:66)
> >         at kafka.consumer.ConsumerIterator.makeNext(ConsumerIterator.scala:32)
> >         at kafka.utils.IteratorTemplate.maybeComputeNext(IteratorTemplate.scala:59)
> >         at kafka.utils.IteratorTemplate.hasNext(IteratorTemplate.scala:51)
> >
> > When this happens, message consumption on the affected partitions doesn't
> > recover but stalls, and the consumer offset remains frozen. The
> > exceptions also continue to be thrown in the logs as the thread logic
> > logs the error, then tries to create another iterator from the stream and
> > consume from it. We also notice that consumption tends to freeze on 2 of
> > the 3 brokers, but there is one that always seems to keep the consumers
> > fed. Are there settings or logic we can use to avoid or recover from such
> > exceptions?
> >
> > -drew
> >
>
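
The advice quoted above (on ConsumerTimeoutException, loop back and call hasNext() on the same iterator instead of creating a new one) can be sketched roughly as follows. This is only an illustration of the loop shape and does not use the Kafka classes: TimeoutSignal and TimedIterator are hypothetical stand-ins for kafka.consumer.ConsumerTimeoutException and the consumer iterator, and the maxTimeouts cutoff is an assumption added so the example terminates. Whether the 0.7.2 iterator itself accepts a repeated hasNext() after a timeout is not verified here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

public class RetryingConsumerSketch {

    /** Hypothetical stand-in for kafka.consumer.ConsumerTimeoutException. */
    static class TimeoutSignal extends RuntimeException {}

    /** Hypothetical stand-in for a consumer iterator with a positive timeout. */
    static class TimedIterator {
        private final BlockingQueue<String> queue;
        private final long timeoutMs;
        private String buffered; // message fetched by hasNext() but not yet returned

        TimedIterator(BlockingQueue<String> queue, long timeoutMs) {
            this.queue = queue;
            this.timeoutMs = timeoutMs;
        }

        /** Waits up to timeoutMs for a message; throws TimeoutSignal if none arrives. */
        boolean hasNext() {
            if (buffered != null) {
                return true;
            }
            try {
                buffered = queue.poll(timeoutMs, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false; // treat interruption as end of stream
            }
            if (buffered == null) {
                throw new TimeoutSignal();
            }
            return true;
        }

        String next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            String message = buffered;
            buffered = null;
            return message;
        }
    }

    /**
     * Consumes from the SAME iterator across timeouts. Stops after maxTimeouts
     * consecutive timeouts (an assumption, so that the sketch terminates).
     */
    static List<String> consume(TimedIterator it, int maxTimeouts) {
        List<String> consumed = new ArrayList<>();
        int consecutiveTimeouts = 0;
        while (consecutiveTimeouts < maxTimeouts) {
            try {
                if (!it.hasNext()) {
                    break; // stream ended (here: the thread was interrupted)
                }
                consumed.add(it.next());
                consecutiveTimeouts = 0;
            } catch (TimeoutSignal e) {
                // No data within the timeout: loop back and call hasNext()
                // again on the same iterator rather than creating a new one.
                consecutiveTimeouts++;
            }
        }
        return consumed;
    }
}
```

The point of the sketch is that the timeout is handled inside the loop, so the iterator and its position are kept; messages that arrive after a timeout are picked up by the next hasNext() call.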
