kafka-users mailing list archives

From Chris Curtin <curtin.ch...@gmail.com>
Subject Re: one consumerConnector or many?
Date Wed, 29 May 2013 14:52:35 GMT
That's a good question about # of sockets when a single consumer is
connecting. I'll let someone from LinkedIn comment if each consumer has a
socket per topic/partition or if it is per Broker, since I'm not familiar
with that part of the code.
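
To make the numbers in the question below easier to compare, here is a small Python sketch of the arithmetic only. It does not say which model Kafka actually uses (that is the open question in this thread). The function name `socket_estimates` and the three model names are made up for illustration, and each figure is an upper bound that assumes one consumer box ends up owning every partition.

```python
def socket_estimates(topics, brokers, partitions_per_topic):
    """Upper-bound socket counts for one consumer box under three
    hypothetical connection models (assumptions, not Kafka internals)."""
    return {
        # one connection per broker, shared by all topics
        "per_broker": brokers,
        # one connection per (topic, broker) pair
        "per_topic_and_broker": topics * brokers,
        # one connection per (topic, partition) pair
        "per_topic_partition": topics * partitions_per_topic,
    }


# The two scenarios raised in the quoted message.
small = socket_estimates(topics=3, brokers=3, partitions_per_topic=6)
large = socket_estimates(topics=100, brokers=16, partitions_per_topic=200)

print(small)
print(large)
```

Under these assumptions the 1600 and 20000 figures in the question correspond to the per-(topic, broker) and per-(topic, partition) models respectively, and a purely per-broker model would give 16.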

On Wed, May 29, 2013 at 9:53 AM, Withers, Robert <Robert.Withers@dish.com> wrote:

> Thanks for the info.  Are you saying that even with a single connector,
> with say 3 topics and 3 threads per topic and 3 brokers with 3 partitions
> for all 3 topics on all 3 brokers, that a consumer box would have 9 sockets
> open?  What if there are 6 partitions per topic, would that be 18 open
> sockets?
>
> I have read somewhere that a high partition number, per topic, is
> desirable, to scale out the consumers and to be prepared to dynamically
> scale out consumption during a traffic spike.  Is it so?  100 topics, with
> 16 brokers and 200 partitions per topic with 1 consumer connector (just
> hypothetically so) would be 1600 sockets or 20000 sockets?
>
> For sure these boxes have plenty of ports.  I am just thinking through
> possible failures and what flexibility we have in configuration of
> producers/consumers to topics.  Really the question is best practices in
> this area.  A producer server handling 100+ msg types could also connect
> quite a bit.  So, perhaps it is best to restrict producer and consumer
> servers to process a restricted "class" of types.  Certainly if the
> producer is also hosting a web server, but perhaps not as dire on the
> consumer side.
>
> thanks,
> rob
> ________________________________________
> From: Chris Curtin [curtin.chris@gmail.com]
> Sent: Wednesday, May 29, 2013 7:36 AM
> To: users
> Subject: Re: one consumerConnector or many?
>
> I'd look at a variation of #2. Can your messages be grouped into a 'class'
> (for lack of a better term) that is consumed together? For example, a
> 'class' of 'auditing events' or 'sensor events'. The idea would then be to
> have a topic per 'class'.
>
> A couple of benefits to this:
> - you can size the resources spent consuming a 'class' by its value. So the
> 'audit' topic may only get a 2-threaded consumer while the 'sensor' class
> gets a 10-threaded consumer.
> - you can stop processing a 'class' of messages if you need to without
> taking all the consumers offline (assuming you have different processors
> or a way to alter your number of threads per topic while running).
>
> Since it sounds like you may be frequently adding new message types, this
> approach also allows you to decide if you want to shut down only a part of
> your processing to add the new code to handle the message.
>
> Finally, why the concern about socket use? A well-configured Windows or
> Linux machine can have thousands of open sockets without problems. Since
> 0.8.0 only connects to the Broker that holds the topic/partition, you end
> up with 1 socket per topic/partition and consumer.
>
> Hope this helps,
>
> Chris
>
>
> On Wed, May 29, 2013 at 9:13 AM, Rob Withers <reefedjib@gmail.com> wrote:
>
> > In thinking about the design of consumption, we have in mind a generic
> > consumer server which would consume from more than one message type.  The
> > handling of each type of message would be different.  I suppose we could
> > have upwards of say 50 different message types, eventually, maybe 100+
> > different types.  Which of the following designs would be best and why
> > would the other options be bad?
> >
> >
> >
> > 1)      Have all message types go through one topic and use a dispatcher
> > pattern to select the correct handler.  Use one consumerConnector.
> >
> > 2)      Use a different topic for each message type, but still use one
> > consumerConnector and a dispatcher pattern.
> >
> > 3)      Use a different topic for each message type and have a separate
> > consumerConnector for each topic.
> >
> >
> >
> > I am struggling with whether my assumptions are correct.  It seems that a
> > single connector for a topic would establish one socket to each broker, as
> > rebalancing assigns various partitions to that thread.  Option 2 would pull
> > messages from more than one topic through a single socket to a particular
> > broker, is it so?  Would option 3 be reasonable, establishing upwards of
> > 100 sockets per broker?
> >
> >
> >
> > I am guesstimating that option 2 is the right way forward, to bound socket
> > use, and we'll need to figure out a way to parameterize stream consumption
> > with the right handlers for a particular msg type.  If we add a topic, do
> > you think we should create a new connector or restart the original
> > connector with the new topic in the map?
> >
> >
> >
> > Thanks,
> >
> > rob
> >
> >
>
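
The "topic per class" idea and the dispatcher pattern discussed in the thread above can be sketched together as follows. This is a minimal Python illustration, not Kafka client code: the handler functions, `TOPIC_PLAN`, `topic_count_map`, and `dispatch` are all hypothetical names, and the thread counts (2 for 'audit', 10 for 'sensor') are taken from the example in the thread. The dictionary returned by `topic_count_map` is only meant to resemble the topic-to-thread-count map that a single connector would be given.

```python
from typing import Callable, Dict, Tuple

Handler = Callable[[bytes], str]


def handle_audit(payload: bytes) -> str:
    # Placeholder for real audit-event processing.
    return "audit:" + payload.decode("utf-8")


def handle_sensor(payload: bytes) -> str:
    # Placeholder for real sensor-event processing.
    return "sensor:" + payload.decode("utf-8")


# One entry per message 'class': topic -> (consumer threads, handler).
TOPIC_PLAN: Dict[str, Tuple[int, Handler]] = {
    "audit": (2, handle_audit),
    "sensor": (10, handle_sensor),
}


def topic_count_map(plan: Dict[str, Tuple[int, Handler]]) -> Dict[str, int]:
    """Extract just topic -> thread count from the plan."""
    return {topic: threads for topic, (threads, _handler) in plan.items()}


def dispatch(topic: str, payload: bytes,
             plan: Dict[str, Tuple[int, Handler]] = TOPIC_PLAN) -> str:
    """Route a message to the handler registered for its topic."""
    if topic not in plan:
        raise ValueError(f"no handler registered for topic {topic!r}")
    _threads, handler = plan[topic]
    return handler(payload)


print(topic_count_map(TOPIC_PLAN))
print(dispatch("audit", b"login ok"))
```

Keeping thread counts and handlers in one table means adding a message class is a one-line change; whether that change also requires restarting the connector is the question the thread leaves open.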
