kafka-users mailing list archives

From Jun Rao <jun...@gmail.com>
Subject Re: one consumerConnector or many?
Date Thu, 30 May 2013 03:35:51 GMT
If the number of connections is not an issue, option 3 is fine too.

Thanks,

Jun


On Wed, May 29, 2013 at 9:09 AM, Withers, Robert <Robert.Withers@dish.com> wrote:

> Thanks, Jun.  We have considered doing message filtering in the consumer.
>  However, the thrust of my question below is not filtering, but
> dispatching.  If we take Chris' recommendation and pump a small set of msg
> types, belonging to the same "class" of messages, such as Account History,
> through the same topic, we will want to process all the messages, but we
> will want to process each msg type within the "class" differently, so we
> will want to dispatch to different handlers.
>
> I totally see your point that if we only want to process a subset of the
> messages, then we really ought to filter in the producer and send the
> filtered message stream to its own topic.
>
> I am leaning toward the architecture of having a different
> consumerConnector per topic, as there ARE plenty of ports.  This allows per
> topic control, which is useful.  Do you see any issues with this approach?
>
> Thanks,
> rob
>
>
> -----Original Message-----
> From: Jun Rao [mailto:junrao@gmail.com]
> Sent: Wednesday, May 29, 2013 9:58 AM
> To: users@kafka.apache.org
> Subject: Re: one consumerConnector or many?
>
> Rob,
>
> You are correct that each consumer instance will use a single socket to
> connect to a broker, independent of the number of topics/partitions. One
> thing that's
> good to avoid is to read all data and filter in the consumer, especially
> when the data is consumed multiple times by different consumers. In this
> case, it's better to put the filtered data in a separate topic and let all
> consumers consume the filtered data directly.
>
> Thanks,
>
> Jun
>
>
>
>
> On Wed, May 29, 2013 at 6:13 AM, Rob Withers <reefedjib@gmail.com> wrote:
>
> > In thinking about the design of consumption, we have in mind a generic
> > consumer server which would consume from more than one message type.
> > The handling of each type of message would be different.  I suppose we
> > could have upwards of say 50 different message types, eventually,
> > maybe 100+ different types.  Which of the following designs would be
> > best and why would the other options be bad?
> >
> >
> >
> > 1)      Have all message types go through one topic and use a dispatcher
> > pattern to select the correct handler.  Use one consumerConnector.
> >
> > 2)      Use a different topic for each message type, but still use one
> > consumerConnector and a dispatcher pattern.
> >
> > 3)      Use a different topic for each message type and have a separate
> > consumerConnector for each topic.
> >
> >
> >
> > I am struggling with whether my assumptions are correct.  It seems
> > that a single connector for a topic would establish one socket to each
> > broker, as rebalancing assigns various partitions to that thread.
> > Option 2 would pull messages from more than one topic through a single
> > socket to a particular broker, is that right?  Would option 3 be
> > reasonable, establishing upwards of 100 sockets per broker?
> >
> >
> >
> > I am guesstimating that option 2 is the right way forward, to bound
> > socket use, and we'll need to figure out a way to parameterize stream
> > consumption with the right handlers for a particular msg type.  If we
> > add a topic, do you think we should create a new connector or restart
> > the original connector with the new topic in the map?
> >
> >
> >
> > Thanks,
> >
> > rob
> >
> >
>
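
The dispatcher pattern discussed in options 1 and 2 above can be sketched roughly as follows. This is a minimal illustration, not Kafka client code: the `Dispatcher` class, the `consume` function, and the topic names are all hypothetical, and the `stream` argument simply stands in for whatever (topic, payload) pairs a single consumerConnector would deliver across its subscribed topics.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# A handler takes the raw message payload; the signature is an assumption.
Handler = Callable[[bytes], None]


class Dispatcher:
    """Routes each message to the handler registered for its topic."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}
        # Messages with no registered handler are kept here for inspection.
        self.unhandled: List[Tuple[str, bytes]] = []

    def register(self, topic: str, handler: Handler) -> None:
        """Associate a handler with a topic, replacing any earlier one."""
        self._handlers[topic] = handler

    def dispatch(self, topic: str, payload: bytes) -> bool:
        """Invoke the topic's handler; return False if none is registered."""
        handler = self._handlers.get(topic)
        if handler is None:
            self.unhandled.append((topic, payload))
            return False
        handler(payload)
        return True


def consume(stream: Iterable[Tuple[str, bytes]], dispatcher: Dispatcher) -> int:
    """Drain (topic, payload) pairs through the dispatcher.

    Returns the number of messages that reached a handler.
    """
    handled = 0
    for topic, payload in stream:
        if dispatcher.dispatch(topic, payload):
            handled += 1
    return handled
```

With this shape, supporting a new message type on the handling side amounts to one more `register` call. How the underlying connector picks up the new topic (a new connector versus restarting the existing one) is the open question raised in the thread, and the sketch does not address it. The same lookup could be keyed on a message-type field instead of the topic if several types share one topic.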
