kafka-users mailing list archives

From Evan Huus <evan.h...@shopify.com>
Subject Re: Horizontally Scaling Kafka Consumers
Date Thu, 30 Apr 2015 14:16:34 GMT
On Thu, Apr 30, 2015 at 2:15 AM, Nimi Wariboko Jr <nimi@channelmeter.com>
wrote:

> My mistake, it seems the Java drivers are a lot more advanced than
> Shopify's Kafka driver (or I am missing something) - and I haven't used
> Kafka before.
>
> With the Go driver, it seems you have to manage offsets and partitions
> within the application code, while in the Scala driver it seems you have
> the option of simply subscribing to a topic, and someone else will manage
> that part.
>
> After digging around a bit more, I found there is another library -
> https://github.com/wvanbergen/kafka - that speaks the consumergroup API
> and accomplishes what I was looking for. I assume it is implemented by
> keeping track of memberships w/ ZooKeeper.
>

Yes. That library is built on top of Sarama (Shopify's Go Kafka driver),
and it's on our roadmap to integrate it properly. As far as I know, this is
the only major area where Sarama is lagging behind the JVM client.


>
> Thank you for the information - it really helped clear up what I was
> failing to understand with Kafka.
>
> Nimi
>
> On Wed, Apr 29, 2015 at 10:10 PM, Joe Stein <joe.stein@stealth.ly> wrote:
>
> > You can do this with the existing Kafka Consumer
> >
> >
> https://github.com/apache/kafka/blob/0.8.2/core/src/main/scala/kafka/consumer/SimpleConsumer.scala#L106
> > and probably any other Kafka client too (maybe with minor/major rework
> > to do the offset management).
> >
> > The new consumer approach is more transparent on "Subscribing To Specific
> > Partitions"
> >
> >
> https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java#L200-L234
> > .
> >
> > Here is a Dockerfile (** pull request pending **) for wrapping Kafka
> > consumers (doesn't have to be the Go client; need to abstract that out
> > some more after more testing)
> >
> >
> https://github.com/stealthly/go_kafka_client/blob/mesos-marathon/consumers/Dockerfile
> >
> >
> > Also a VM (** pull request pending **) to build the container, push it
> > to a local Docker repository, and launch it on Apache Mesos
> >
> >
> https://github.com/stealthly/go_kafka_client/tree/mesos-marathon/mesos/vagrant
> > as a working example of how to do it.
> >
> > All of this could be done without the Docker container and still work on
> > Mesos ... or even without Mesos and on YARN.
> >
> > You might also want to check out how Samza integrates with Execution
> > Frameworks
> >
> >
> http://samza.apache.org/learn/documentation/0.9/comparisons/introduction.html
> > which has a Mesos patch https://issues.apache.org/jira/browse/SAMZA-375
> > and built-in YARN support.
> >
> > ~ Joe Stein
> > - - - - - - - - - - - - - - - - -
> >
> >   http://www.stealth.ly
> > - - - - - - - - - - - - - - - - -
> >
> > On Wed, Apr 29, 2015 at 8:56 AM, David Corley <davidcorley@gmail.com>
> > wrote:
> >
> > > You're right Stevo, I should rephrase to say that there can be no more
> > > _active_ consumers than there are partitions (within a single consumer
> > > group).
> > > I'm guessing that's what Nimi is asking about, but perhaps he can
> > > elaborate on whether he's using consumer groups and/or whether the 100
> > > partitions are all for a single topic, or multiple topics.
> > >
> > > On 29 April 2015 at 13:38, Stevo Slavić <sslavic@gmail.com> wrote:
> > >
> > > > Please correct me if I'm wrong, but I think it is not really a hard
> > > > constraint that one cannot have more consumers (from the same group)
> > > > than partitions on a single topic - all the surplus consumers will
> > > > not be assigned any partition to consume, but they can be there, and
> > > > as soon as one active consumer from the same group goes offline (its
> > > > connection to ZK is dropped), consumers from the group will be
> > > > rebalanced so one passively waiting consumer will become active.
> > > >
> > > > Kind regards,
> > > > Stevo Slavic.
> > > >
> > > > On Wed, Apr 29, 2015 at 2:25 PM, David Corley <davidcorley@gmail.com>
> > > > wrote:
> > > >
> > > > > If the 100 partitions are all for the same topic, you can have up
> > > > > to 100 consumers working as part of a single consumer group for
> > > > > that topic. You cannot have more consumers than there are
> > > > > partitions within a given consumer group.
> > > > >
> > > > > On 29 April 2015 at 08:41, Nimi Wariboko Jr <nimi@channelmeter.com>
> > > > > wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > I was wondering what options there are for horizontally scaling
> > > > > > Kafka consumers? Basically, if I have 100 partitions and 10
> > > > > > consumers, and want to temporarily scale up to 50 consumers, what
> > > > > > options do I have?
> > > > > >
> > > > > > So far I've thought of simply tracking consumer membership
> > > > > > somehow (either through Raft or ZooKeeper's znodes) on the
> > > > > > consumers.
> > > > > >
> > > > >
> > > >
> > >
> >
>
