kafka-users mailing list archives

From: Jun Rao <jun...@gmail.com>
Subject: Re: Kafka Consumer Threads Stalled
Date: Sat, 17 Aug 2013 04:36:09 GMT
It should be the number of fetcher threads. Yes, this is the maximum memory
usage. You will only hit it if all partitions have fetch.size bytes to give,
which typically only happens when the consumer was stopped and restarted
after some time.

Thanks,

Jun
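Plugging the settings quoted later in the thread into Jun's per-partition
formula (fetch_size=52428800, queuedchunks.max=6, 8 partitions on each of
3 brokers) shows why the OOMEs are plausible; the single fetcher thread in
this sketch is an assumption, not something stated in the thread:

```java
public class ConsumerMemoryEstimate {
    public static void main(String[] args) {
        long fetchSize = 52428800L;  // consumer fetch_size from the thread
        int queuedChunksMax = 6;     // queuedchunks.max from the thread
        int partitions = 8 * 3;      // 8 partitions on each of 3 brokers
        int fetcherThreads = 1;      // assumed; adjust for your deployment

        // <fetcher threads> * <queuedchunks.max> * <fetch size> * #partitions
        long maxBytes =
            fetcherThreads * queuedChunksMax * fetchSize * (long) partitions;
        System.out.println(maxBytes);                          // 7549747200
        System.out.println(maxBytes / (1024.0 * 1024 * 1024)); // 7.03125 GB
    }
}
```

At roughly 7 GB of worst-case buffered data per fetcher thread, hitting an
OOME after a consumer restart, when every partition has a backlog to hand
out, is unsurprising.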


On Fri, Aug 16, 2013 at 9:36 AM, Drew Daugherty <
Drew.Daugherty@returnpath.com> wrote:

> Just to clarify, are the consumer threads you are referring to the number
> passed into the map along with the topic when instantiating the connector,
> or the fetcher thread count?  This formula must specify a maximum memory
> usage rather than a working usage, or we would still be getting OOMEs.
> Under what conditions would the max memory usage be reached?
>
> -drew
> ________________________________________
> From: Drew Daugherty [Drew.Daugherty@returnpath.com]
> Sent: Friday, August 16, 2013 9:12 AM
> To: users@kafka.apache.org
> Subject: RE: Kafka Consumer Threads Stalled
>
> Thanks Jun, that would explain why I was running out of memory.
>
> -drew
> ________________________________________
> From: Jun Rao [junrao@gmail.com]
> Sent: Friday, August 16, 2013 8:37 AM
> To: users@kafka.apache.org
> Subject: Re: Kafka Consumer Threads Stalled
>
> The more accurate formula is the following, since the fetch size is per
> partition.
>
> <consumer threads> * <queuedchunks.max> * <fetch size> * #partitions
>
> Thanks,
>
> Jun
>
>
> On Thu, Aug 15, 2013 at 9:40 PM, Drew Daugherty <
> Drew.Daugherty@returnpath.com> wrote:
>
> > Thank you Jun. It turned out an OOME was thrown in one of the consumer
> > fetcher threads.  Speaking of which, what is the best method for
> > determining the consumer memory usage?  I had read that the formula below
> > would suffice, but I am questioning it:
> >
> > <consumer threads> * <queuedchunks.max> * <fetch size>
> >
> > -drew
> > ________________________________________
> > From: Jun Rao [junrao@gmail.com]
> > Sent: Wednesday, August 14, 2013 8:42 AM
> > To: users@kafka.apache.org
> > Subject: Re: Kafka Consumer Threads Stalled
> >
> > In that case, have you looked at
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Myconsumerseemstohavestopped%2Cwhy%3F
> > ?
> >
> > Thanks,
> >
> > Jun
> >
> >
> > On Wed, Aug 14, 2013 at 7:38 AM, Drew Daugherty <
> > Drew.Daugherty@returnpath.com> wrote:
> >
> > > The problem is not the fact that the timeout exceptions are being
> > > thrown. We have tried with and without the timeout setting and, in both
> > > cases, we end up with threads that are stalled and not consuming data.
> > > Thus the problem is consumers that are registered but not consuming,
> > > with no rebalancing being done. We suspected a problem with ZooKeeper,
> > > but we have run smoke and latency tests and got reasonable results.
> > >
> > > -drew
> > >
> > > Sent from Moxier Mail
> > > (http://www.moxier.com)
> > >
> > >
> > > ----- Original Message -----
> > > From: Jun Rao <junrao@gmail.com>
> > > To: "users@kafka.apache.org" <users@kafka.apache.org>
> > > Sent: 8/13/2013 10:17 PM
> > > Subject: Re: Kafka Consumer Threads Stalled
> > >
> > >
> > >
> > > If you don't want to see ConsumerTimeoutException, just set
> > > consumer.timeout.ms to -1. If you do need consumer.timeout.ms larger
> > > than 0, make sure that on ConsumerTimeoutException your consumer thread
> > > loops back and calls hasNext() on the iterator to resume consumption.
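The loop-and-retry pattern Jun describes can be sketched as follows. The
Kafka 0.7 classes are not reproduced here; ConsumerTimeoutException and the
iterator below are hypothetical stand-ins that mimic a ConsumerIterator
configured with consumer.timeout.ms > 0, so only the control flow is the
point:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Stand-in for kafka.consumer.ConsumerTimeoutException.
class ConsumerTimeoutException extends RuntimeException {}

// Stand-in iterator that "times out" once before each message, mimicking a
// stream with consumer.timeout.ms set and gaps between message arrivals.
class TimingOutIterator implements Iterator<String> {
    private final Iterator<String> inner;
    private boolean timeoutNext = true;

    TimingOutIterator(List<String> messages) {
        this.inner = messages.iterator();
    }

    @Override public boolean hasNext() {
        if (timeoutNext && inner.hasNext()) {
            timeoutNext = false;
            throw new ConsumerTimeoutException();
        }
        timeoutNext = true;
        return inner.hasNext();
    }

    @Override public String next() { return inner.next(); }
}

public class RetryLoop {
    // On ConsumerTimeoutException, loop back and call hasNext() again
    // on the SAME iterator instead of abandoning it.
    static List<String> consumeAll(Iterator<String> it) {
        List<String> seen = new ArrayList<>();
        while (true) {
            try {
                if (!it.hasNext()) break;
                seen.add(it.next());
            } catch (ConsumerTimeoutException e) {
                continue; // no message within the timeout; resume consuming
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        List<String> out =
            consumeAll(new TimingOutIterator(Arrays.asList("a", "b", "c")));
        System.out.println(out); // [a, b, c]
    }
}
```

The key point is that the catch clause re-enters the hasNext() loop on the
same iterator, which differs from the approach described further down the
thread, where a fresh iterator is created from the stream after each
exception.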
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > >
> > > On Tue, Aug 13, 2013 at 4:57 PM, Drew Daugherty <
> > > Drew.Daugherty@returnpath.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > We are using zookeeper 3.3.6 with kafka 0.7.2. We have a topic with 8
> > > > partitions on each of 3 brokers that we are consuming with a consumer
> > > group
> > > > with multiple threads.  We are using the following settings for our
> > > > consumers:
> > > > zk.connectiontimeout.ms=12000000
> > > > fetch_size=52428800
> > > > queuedchunks.max=6
> > > > consumer.timeout.ms=5000
> > > >
> > > > Our brokers have the following configuration:
> > > > socket.send.buffer=1048576
> > > > socket.receive.buffer=1048576
> > > > max.socket.request.bytes=104857600
> > > > log.flush.interval=10000
> > > > log.default.flush.interval.ms=1000
> > > > log.default.flush.scheduler.interval.ms=1000
> > > > log.retention.hours=4
> > > > log.file.size=536870912
> > > > enable.zookeeper=true
> > > > zk.connectiontimeout.ms=6000
> > > > zk.sessiontimeout.ms=6000
> > > > max.message.size=52428800
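One thing worth noting about these settings: with the high-level consumer of
this era, a message larger than the consumer's fetch size cannot be fetched
in full, so consumption of the affected partition can stop advancing and
look stalled. Here fetch_size and max.message.size are both 52428800, so
that particular failure mode appears to be ruled out; a minimal sketch of
the invariant to check (the variable names are illustrative, not a Kafka
API):

```java
public class FetchSizeCheck {
    public static void main(String[] args) {
        long consumerFetchSize = 52428800L; // consumer fetch_size
        long brokerMaxMessage = 52428800L;  // broker max.message.size
        // The consumer must be able to fetch the largest message the
        // broker will accept, or that partition can stop advancing.
        boolean safe = consumerFetchSize >= brokerMaxMessage;
        System.out.println(safe); // true
    }
}
```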
> > > >
> > > > We are noticing that after the consumer runs for a short while, some
> > > > threads stop consuming and start throwing the following timeout
> > > exceptions:
> > > > kafka.consumer.ConsumerTimeoutException
> > > >         at
> > > > kafka.consumer.ConsumerIterator.makeNext(ConsumerIterator.scala:66)
> > > >         at
> > > > kafka.consumer.ConsumerIterator.makeNext(ConsumerIterator.scala:32)
> > > >         at
> > > > kafka.utils.IteratorTemplate.maybeComputeNext(IteratorTemplate.scala:59)
> > > >         at
> > > > kafka.utils.IteratorTemplate.hasNext(IteratorTemplate.scala:51)
> > > >
> > > > When this happens, message consumption on the affected partitions
> > > > doesn't recover but stalls, and the consumer offset remains frozen.
> > > > The exceptions also continue to be thrown in the logs as the thread
> > > > logic logs the error and then tries to create another iterator from
> > > > the stream and consume from it.  We also notice that consumption
> > > > tends to freeze on 2 of the 3 brokers, but there is one that always
> > > > seems to keep the consumers fed.  Are there settings or logic we can
> > > > use to avoid or recover from such exceptions?
> > > >
> > > > -drew
> > > >
> > >
> >
>
