kafka-users mailing list archives

From Stevo Slavić <ssla...@gmail.com>
Subject Re: Reusable consumer across consumer groups
Date Sat, 14 Mar 2015 00:03:59 GMT
Sorry for the late reply. I'm not sure what further details you need.
Here's an example of exposing Kafka through remoting (HTTP/REST) :-)
http://confluent.io/docs/current/kafka-rest/docs/intro.html
Even without looking into the Kafka REST proxy code, one can see from its
limitations that it uses the HL (high-level) consumer, with all its deficiencies.
For example, a commit "request *must* be made to the specific REST proxy instance
holding the consumer instance" (see
http://confluent.io/docs/current/kafka-rest/docs/api.html#post--consumers-%28string-group_name%29-instances-%28string-instance%29-offsets
). Also "because consumers are stateful, any consumer instances created
with the REST API are tied to a specific REST proxy instance", and
"consumers may not change the set of topics they are subscribed to once
they have started consuming messages" (see
http://confluent.io/docs/current/kafka-rest/docs/api.html#consumers )

One of the things that makes high-level consumer objects heavy is that each
one starts many threads, so only a limited number of HL consumer instances can
be created per node. Beyond that limit an OOM error is thrown, not because
there is too little memory, but because too many threads have been started.

With 0.8.2.x, not much has changed regarding the ability to reuse HL consumer
instances to poll on behalf of different consumer groups. Consumer instances
are stateful; the most important state is the offset and the lock(s) that the
active consumer is holding. Luckily, there's the simple consumer API.
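
To illustrate the kind of reuse described above, here is a minimal sketch in Python. It is not Kafka client code: `InMemoryLog` is a hypothetical stand-in for a partition log that supports stateless fetches by offset (the style of access the simple consumer API offers), and `GroupAwareReader` is a hypothetical wrapper that keeps per-group offsets outside the reader so that the consumer group becomes a parameter of each read. Offset storage, locking, and all names are assumptions made for the example.

```python
from collections import defaultdict
from threading import Lock


class InMemoryLog:
    """Hypothetical stand-in for a broker partition log; NOT a Kafka client."""

    def __init__(self):
        self._partitions = defaultdict(list)

    def append(self, topic, partition, message):
        self._partitions[(topic, partition)].append(message)

    def fetch(self, topic, partition, offset, max_messages):
        # Stateless read: the caller supplies the offset on every call.
        log = self._partitions[(topic, partition)]
        return log[offset:offset + max_messages]


class GroupAwareReader:
    """One shared reader; the consumer group is a per-call parameter."""

    def __init__(self, log):
        self._log = log
        # (group, topic, partition) -> next offset to read
        self._offsets = defaultdict(int)
        self._lock = Lock()

    def read(self, group, topic, partition, max_messages=10):
        key = (group, topic, partition)
        # A single lock keeps the sketch simple; it serializes all reads.
        with self._lock:
            start = self._offsets[key]
            messages = self._log.fetch(topic, partition, start, max_messages)
            # Advance only this group's position; other groups are unaffected.
            self._offsets[key] = start + len(messages)
        return messages


if __name__ == "__main__":
    log = InMemoryLog()
    for i in range(5):
        log.append("events", 0, f"m{i}")

    reader = GroupAwareReader(log)
    print(reader.read("group-a", "events", 0, max_messages=2))  # ['m0', 'm1']
    print(reader.read("group-b", "events", 0, max_messages=3))  # ['m0', 'm1', 'm2']
    print(reader.read("group-a", "events", 0, max_messages=2))  # ['m2', 'm3']
```

The sketch leaves out everything that makes the real problem hard: durable offset storage, partition ownership across several nodes, and rebalancing. It only shows that once offsets live outside the reader object, one instance can serve many groups.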

Kind regards,
Stevo Slavic.

On Thu, Oct 23, 2014 at 6:36 PM, Neha Narkhede <neha.narkhede@gmail.com>
wrote:

> I'm wondering how much of this can be done using careful system design vs
> building it within the consumer itself. You could distribute the several
> consumer instances across machines since it is built for distributed load
> balancing. That will sufficiently isolate the resources required to run the
> various consumers. But probably you have a specific use case in mind for
> running several consumer groups on the same machine. Would you mind giving
> more details?
>
> On Thu, Oct 23, 2014 at 12:55 AM, Stevo Slavić <sslavic@gmail.com> wrote:
>
> > Imagine exposing Kafka over various remoting protocols, where incoming
> > poll/read requests may come in concurrently for different consumer groups,
> > especially in a case with lots of different consumer groups.
> > If you create and destroy a KafkaConsumer for each such request, response
> > times will be poor and throughput very low, and doing that is one of the
> > ways to reproduce https://issues.apache.org/jira/browse/KAFKA-1716
> >
> > It would be better if one could reuse a (pool of) Consumer instances, and
> > through a read operation parameter specify for which consumer group the
> > read should be performed.
> >
> > Kind regards,
> > Stevo Slavic.
> >
> > On Tue, Oct 14, 2014 at 6:17 PM, Neha Narkhede <neha.narkhede@gmail.com>
> > wrote:
> >
> > > Stevo,
> > >
> > > The new consumer API is planned for 0.9, not 0.8.2. You can take a look
> > > at a detailed javadoc here
> > > <http://people.apache.org/~nehanarkhede/kafka-0.9-consumer-javadoc/doc/org/apache/kafka/clients/consumer/KafkaConsumer.html>.
> > >
> > > Can you explain why you would like to poll messages across consumer
> > > groups using just one instance?
> > >
> > > Thanks,
> > > Neha
> > >
> > > On Tue, Oct 14, 2014 at 1:03 AM, Stevo Slavić <sslavic@gmail.com> wrote:
> > >
> > > > Hello Apache Kafka community,
> > > >
> > > > The current (Kafka 0.8.1.1) high-level API's KafkaConsumer is not a
> > > > lightweight object: its creation takes some time and resources, and it
> > > > does not seem to be thread-safe. Its API also does not support reuse for
> > > > consuming messages from different consumer groups.
> > > >
> > > > I see that even in the coming (0.8.2) redesigned API it will not be
> > > > possible to reuse a consumer instance to poll messages from different
> > > > consumer groups.
> > > >
> > > > Can something be done to support this?
> > > >
> > > > Would it help if there was consumer group as a separate entity from
> > > > consumer, for all the subscription management tasks?
> > > >
> > > > Kind regards,
> > > > Stevo Slavic
> > > >
> > >
> >
>
