kafka-users mailing list archives

From Philip O'Toole <philip.oto...@yahoo.com.INVALID>
Subject Re: consumer rebalance weirdness
Date Thu, 07 Aug 2014 22:22:57 GMT
Turn on GC logging (verbose time stamps) and see how long your pauses are. 
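
As a rough illustration of the GC-log check: on HotSpot JVMs of this era, flags such as `-verbose:gc -XX:+PrintGCDateStamps -XX:+PrintGCApplicationStoppedTime -Xloggc:<file>` write one "Total time for which application threads were stopped" line per pause. The Python sketch below scans such lines for pauses longer than a threshold. The function name `long_pauses`, the sample log lines, and the 6000 ms threshold (which I believe is the 0.8.x default for `zookeeper.session.timeout.ms`) are assumptions for illustration; substitute your own log and configured timeout.

```python
import re

# Matches the line HotSpot prints when -XX:+PrintGCApplicationStoppedTime is set.
STOPPED = re.compile(
    r"Total time for which application threads were stopped: "
    r"(?P<seconds>\d+[.,]\d+) seconds"
)


def long_pauses(log_lines, threshold_ms):
    """Return (line_number, pause_ms) for every pause longer than threshold_ms."""
    hits = []
    for number, line in enumerate(log_lines, start=1):
        match = STOPPED.search(line)
        if match is None:
            continue  # not a stopped-time line
        # Some locales print a decimal comma; normalise before converting.
        pause_ms = float(match.group("seconds").replace(",", ".")) * 1000.0
        if pause_ms > threshold_ms:
            hits.append((number, pause_ms))
    return hits


# Made-up sample lines in the stopped-time format (not taken from a real log).
SAMPLE_LOG = [
    "2014-08-07T15:01:02.123-0700: 12.345: Total time for which application threads were stopped: 0.0123450 seconds",
    "2014-08-07T15:01:05.456-0700: 15.678: [GC 12.3: [ParNew: 1024K->128K(2048K), 0.0100000 secs]",
    "2014-08-07T15:02:10.789-0700: 80.901: Total time for which application threads were stopped: 7.5000000 seconds",
]

if __name__ == "__main__":
    for line_number, pause_ms in long_pauses(SAMPLE_LOG, threshold_ms=6000):
        print(f"line {line_number}: paused {pause_ms:.0f} ms")
```

Any pause reported here is longer than the session timeout, which would be enough on its own to expire the ZooKeeper session and trigger a rebalance.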

Sure, try increasing the timeout to see if it fixes the problem, but I would hesitate to make
that change permanent until you understand the problem better. 

You could also profile your consumer to see where it is spending its time.  Perhaps you can
make your message consumption quicker. I am sure the core committers would also have some ideas.
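
Short of a full profiler, a first step might be to time each message handler call and record the slow ones. The sketch below is language-neutral in spirit and written in Python only for brevity; the real consumer here runs on the JVM, where a JVM profiler would give more detail. The names `timed`, `handler`, and `slow_ms` are hypothetical and not part of any Kafka API.

```python
import time


def timed(messages, handler, slow_ms, clock=time.monotonic):
    """Run handler on each message; return (index, elapsed_ms) for slow calls."""
    slow = []
    for index, message in enumerate(messages):
        started = clock()
        handler(message)
        elapsed_ms = (clock() - started) * 1000.0
        if elapsed_ms >= slow_ms:
            slow.append((index, elapsed_ms))
    return slow


if __name__ == "__main__":
    def handler(message):
        # Stand-in for real processing such as persisting to a back-end system.
        time.sleep(0.05 if message == "slow" else 0.0)

    for index, elapsed_ms in timed(["fast", "slow", "fast"], handler, slow_ms=20):
        print(f"message {index} took {elapsed_ms:.1f} ms")
```

The `clock` parameter exists only so the timing can be replaced with a deterministic source when checking the logic.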

Philip

----------------------------------
http://www.philipotoole.com

> On Aug 7, 2014, at 3:06 PM, Jason Rosenberg <jbr@squareup.com> wrote:
> 
> Yeah, it's possible that's happening (but no smoking gun).  The main thing I'm seeing
> is that when it actually takes the time to process messages, it takes longer to get back to
> the ConsumerIterator for the next message.  That alone seems to be the problem (does that
> make any sense)?  I would have thought the zk listeners are in separate async threads (and
> that's what it looks like looking at the kafka consumer code).
> 
> Maybe I should increase the zk session timeout and see if that helps.
> 
> 
>> On Thu, Aug 7, 2014 at 2:56 PM, Philip O'Toole <philip.otoole@yahoo.com.invalid> wrote:
>> A big GC pause in your application, for example, could do it.
>> 
>> Philip
>> 
>>  
>> -----------------------------------------
>> http://www.philipotoole.com
>> 
>> 
>> On Thursday, August 7, 2014 11:56 AM, Philip O'Toole <philip.otoole@yahoo.com> wrote:
>> 
>> 
>> 
>> I think the question is what in your consuming application could cause it not to
>> check in with ZK for longer than the timeout.
>> 
>>  
>> -----------------------------------------
>> http://www.philipotoole.com
>> 
>> 
>> On Thursday, August 7, 2014 8:16 AM, Jason Rosenberg <jbr@squareup.com> wrote:
>> 
>> 
>> 
>> Well, it's possible that when processing, it might take longer than the
>> zookeeper timeout to process a message, intermittently.  Would that cause a
>> zookeeper timeout?
>> 
>> (btw I'm using 0.8.1.1).
>> 
>> 
>> 
>> On Thu, Aug 7, 2014 at 2:30 AM, Clark Haskins <chaskins@linkedin.com.invalid> wrote:
>> 
>> > Is your application possibly timing out its zookeeper connection during
>> > consumption while doing its processing, thus triggering the rebalance?
>> >
>> > -Clark
>> >
>> > On 8/6/14, 11:18 PM, "Jason Rosenberg" <jbr@squareup.com> wrote:
>> >
>> > >We've noticed that some of our consumers are more likely to repeatedly
>> > >trigger rebalancing when the app is consuming messages more slowly (e.g.
>> > >persisting data to back-end systems, etc.).
>> > >
>> > >If on the other hand we 'fast-forward' the consumer (which essentially
>> > >means we tell it to consume but do nothing with the messages until all
>> > >caught up), it will never decide to do a rebalance during this time.  So
>> > >it can go hours without rebalancing while fast forwarding and consuming super
>> > >fast, while during normal processing, it might decide to rebalance every
>> > >minute or so.
>> > >
>> > >Is there any simple explanation for this?
>> > >
>> > >Usually the trigger for rebalance logged is that a "topic info for path X
>> > >has changed to Y, triggering rebalance".
>> > >
>> > >Thanks for any ideas.
>> > >
>> > >We'd like to reduce the rebalancing, as it essentially slows down
>> > >consumption each time it happens.
>> > >
>> > >Thanks
>> > >
>> > >Jason
>> >
>> >
> 
