kafka-users mailing list archives

From Ian Friedman <...@flurry.com>
Subject Re: Consumer throughput imbalance
Date Sun, 25 Aug 2013 15:59:16 GMT
What if you don't know ahead of time how long a message will take to consume? 

-- 
Ian Friedman


On Sunday, August 25, 2013 at 10:45 AM, Neha Narkhede wrote:

> Making producer side partitioning depend on consumer behavior might not be
> such a good idea. If consumption is a bottleneck, changing producer side
> partitioning may not help. To relieve consumption bottleneck, you may need
> to increase the number of partitions for those topics and increase the
> number of consumer instances.
> 
> You mentioned that the consumers take longer to process certain kinds of
> messages. What you can do is place the messages that require slower
> processing in separate topics, so that you can scale the number of
> partitions and number of consumer instances, for those messages
> independently.
> 
> Thanks,
> Neha
> 
> 
> On Sat, Aug 24, 2013 at 9:57 AM, Ian Friedman <ian@flurry.com> wrote:
> 
> > Hey guys! We deployed our Kafka data pipeline application over
> > the weekend, and it is working out quite well now that we have ironed out
> > the issues. There is one behavior that we've noticed that is mildly troubling,
> > though not a deal breaker. We're using a single topic with many partitions
> > (1200 total) to load balance our 300 consumers, but what seems to happen is
> > that some partitions end up more backed up than others. This is probably
> > due more to the specifics of the application since some messages take much
> > longer than others to process.
> > 
> > I'm thinking that the random partitioning in the producer is unsuited to
> > our specific needs. One option I was considering was to write an alternate
> > partitioner that looks at the consumer offsets from ZooKeeper (as in the
> > ConsumerOffsetChecker) and probabilistically weights the partitions by
> > their lag, so that backed-up partitions receive fewer new messages. Does
> > this sound like a good idea to anyone else? Is there a
> > better or preferably already built solution? If anyone has any ideas or
> > feedback I'd sincerely appreciate it.
> > 
> > Thanks so much in advance.
> > 
> > P.S. Thanks especially to everyone who's answered my dumb questions on
> > this mailing list over the past few months. We couldn't have done it
> > without you!
> > 
> > --
> > Ian Friedman
> > 
> 
> 
> 
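
For readers following the thread, the lag-weighted partitioner idea from the original message could look roughly like the sketch below. It is an illustration only, not anything proposed or endorsed on the list: the function names, the 1 / (1 + lag) weighting, and the sample lag numbers are all assumptions, and the lag snapshot is passed in as a plain dict rather than read from ZooKeeper. Neha's caution still applies, since this couples producer behavior to consumer state.

```python
import random


def lag_weights(lags):
    """Map partition -> weight; a smaller lag gives a larger weight.

    The 1 / (1 + lag) formula is an assumed choice for illustration.
    """
    return {p: 1.0 / (1.0 + lag) for p, lag in lags.items()}


def pick_partition(lags, rng=random):
    """Pick a partition at random, favoring partitions with less lag.

    `lags` is a hypothetical snapshot {partition_id: messages_behind}.
    In a real deployment it would have to be refreshed periodically
    from the consumer offsets; that part is not shown here.
    """
    weights = lag_weights(lags)
    partitions = sorted(weights)
    return rng.choices(
        partitions,
        weights=[weights[p] for p in partitions],
        k=1,
    )[0]


# Made-up lag snapshot for four partitions.
example_lags = {0: 0, 1: 9, 2: 99, 3: 999}

if __name__ == "__main__":
    rng = random.Random(42)
    counts = {p: 0 for p in example_lags}
    for _ in range(10000):
        counts[pick_partition(example_lags, rng)] += 1
    print(counts)
```

Because the choice stays probabilistic, a lagging partition still receives some traffic; it is only deprioritized. The sketch does not address Ian's follow-up question, since lag is observed after the fact rather than predicted per message.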


