kafka-users mailing list archives

From Pradeep Bhattiprolu <pbhatt...@gmail.com>
Subject Re: Kafka Newbie question
Date Wed, 13 Apr 2016 05:10:14 GMT
I tried both of the approaches suggested in this thread, with no luck :(.
Let me give a concrete example of what I am trying to achieve here:

1) A Kafka producer adds multiple JSON messages to a particular topic in the
message broker (done, this part works).
2) I want multiple consumers, identified under a single consumer group, to
read these messages and index them into a search engine. To have multiple
consumers, I have created a process which spawns multiple Java threads.
3) While creating these threads and adding them to the pool, I select one
thread as a "leader" thread. The only difference between a leader thread
and other threads is that the "leader" thread sets the offset to the beginning
by calling Consumer.seekToBeginning() (with no parameters).
4) Other threads pick out messages based on the last committed offsets of
the "leader" thread and other consumer threads.
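
For reference, a toy model in plain Java of how the partitions of a topic
are split across the threads of one consumer group. This is only a sketch,
not Kafka's assignor: the class name, method name, and thread names are
hypothetical, and round-robin is an assumed strategy. It illustrates that
within a group each partition has exactly one owner, so a seek made by the
"leader" can only move the position on the partitions the leader itself owns.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: NOT Kafka's assignor. All names here are hypothetical.
final class GroupAssignmentSketch {

    // Spread partitions 0..partitionCount-1 over the group's consumers
    // round-robin, so every partition has exactly one owner in the group.
    // Assumes the consumer list is non-empty.
    static Map<String, List<Integer>> assign(List<String> consumers, int partitionCount) {
        Map<String, List<Integer>> owned = new LinkedHashMap<>();
        for (String consumer : consumers) {
            owned.put(consumer, new ArrayList<>());
        }
        for (int p = 0; p < partitionCount; p++) {
            owned.get(consumers.get(p % consumers.size())).add(p);
        }
        return owned;
    }

    public static void main(String[] args) {
        List<String> threads = Arrays.asList("leader", "worker-1", "worker-2");
        // Four partitions: every thread owns at least one.
        System.out.println(assign(threads, 4));
        // One partition: only one thread owns it, the others stay idle.
        System.out.println(assign(threads, 1));
    }
}
```

With a single-partition topic, only one thread of the group receives
messages, however many threads are started.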

The idea is to ensure that every consumer group reading a given topic
always reads from the start of the topic. Within my application it is fair
to assume that each consumer group should read from the start.
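
A minimal sketch of the start-position rule discussed in this thread, as a
toy model in plain Java rather than Kafka's actual implementation. The class,
the key format, and the group and partition names are hypothetical. It
assumes the documented behaviour: a valid committed offset for the group
wins, and auto.offset.reset is consulted only when no such offset exists.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: mimics the documented rule, not Kafka's implementation.
final class StartOffsetSketch {

    // Committed offsets keyed by "group/partition" (hypothetical key format).
    private final Map<String, Long> committed = new HashMap<>();

    void commit(String group, String partition, long offset) {
        committed.put(group + "/" + partition, offset);
    }

    // Where a consumer in `group` starts on `partition` if it never calls seek.
    long startOffset(String group, String partition, String autoOffsetReset,
                     long logStart, long logEnd) {
        Long offset = committed.get(group + "/" + partition);
        if (offset != null && offset >= logStart && offset <= logEnd) {
            return offset; // valid committed offset: auto.offset.reset is ignored
        }
        if ("earliest".equals(autoOffsetReset)) {
            return logStart;
        }
        if ("latest".equals(autoOffsetReset)) {
            return logEnd;
        }
        // The real client also accepts "none", which raises an error instead.
        throw new IllegalArgumentException("unsupported policy: " + autoOffsetReset);
    }

    public static void main(String[] args) {
        StartOffsetSketch log = new StartOffsetSketch();
        // Fresh group, nothing committed: "earliest" applies.
        System.out.println(log.startOffset("indexer-run-1", "docs-0", "earliest", 0L, 100L)); // 0
        // Same group after committing offset 42: the commit wins.
        log.commit("indexer-run-1", "docs-0", 42L);
        System.out.println(log.startOffset("indexer-run-1", "docs-0", "earliest", 0L, 100L)); // 42
        // A different group id has no commit, so it starts from the beginning again.
        System.out.println(log.startOffset("indexer-run-2", "docs-0", "earliest", 0L, 100L)); // 0
    }
}
```

Under this model, a group id that has never committed, combined with
auto.offset.reset set to "earliest", starts at the beginning of the topic.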

PS: I am using the new KafkaConsumer class and the latest 0.9.0.0 version
of the kafka-clients dependency in my project.

Any help / code samples to help me move forward are highly appreciated.

Thanks
Pradeep

On Sun, Apr 10, 2016 at 2:43 AM, Liquan Pei <liquanpei@gmail.com> wrote:

> Hi Pradeep,
>
> I think I misinterpreted your question. Are you trying to consume a topic
> multiple times for each consumer instance or consume one topic with
> multiple consumer instances?
>
> If you want to consume a topic multiple times with one consumer
> instance, `seekToBeginning` will reset the offset to the beginning on the
> next `poll`.
>
> If you want each thread to consume the same topic multiple times,
> you need to use multiple consumer groups. Otherwise, only one consumer
> instance will be consuming the topic.
>
> Thanks,
> Liquan
>
>
>
> On Sun, Apr 10, 2016 at 12:21 AM, Harsha <kafka@harsha.io> wrote:
>
> > Pradeep,
> > How about
> >
> > https://kafka.apache.org/090/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html#seekToBeginning%28org.apache.kafka.common.TopicPartition...%29
> >
> > -Harsha
> >
> > On Sat, Apr 9, 2016, at 09:48 PM, Pradeep Bhattiprolu wrote:
> > > Liquan, thanks for the response.
> > > If I set auto commit to false, do I have to manage offsets manually?
> > > I am running multiple threads, each thread being a consumer; it would
> > > be complicated to manage offsets across threads if I don't use Kafka's
> > > automatic consumer group abstraction.
> > >
> > > Thanks
> > > Pradeep
> > >
> > > On Sat, Apr 9, 2016 at 3:12 AM, Liquan Pei <liquanpei@gmail.com> wrote:
> > >
> > > > Hi Pradeep,
> > > >
> > > > Can you try setting enable.auto.commit = false if you want to read from
> > > > the earliest offset? According to the documentation, auto.offset.reset
> > > > controls what to do when there is no initial offset in Kafka or if the
> > > > current offset does not exist any more on the server (e.g. because that
> > > > data has been deleted). If auto commit is enabled, a committed offset
> > > > already exists on the server, so auto.offset.reset does not apply.
> > > >
> > > > Thanks,
> > > > Liquan
> > > >
> > > > On Fri, Apr 8, 2016 at 10:44 PM, Pradeep Bhattiprolu <pbhattip9@gmail.com> wrote:
> > > >
> > > > > Hi All
> > > > >
> > > > > I am a newbie to Kafka. I am using the new Consumer API in a thread
> > > > > acting as a consumer for a topic in Kafka.
> > > > > For my testing and other purposes I have read the queue multiple
> > > > > times using Kafka's console-consumer.sh script.
> > > > >
> > > > > To start reading messages from the beginning in my Java code, I have
> > > > > set auto.offset.reset to "earliest".
> > > > >
> > > > > However, that property does not guarantee that I begin reading
> > > > > messages from the start; the consumer resumes from the most recently
> > > > > committed offset for the consumer group.
> > > > >
> > > > > Here is my question:
> > > > > Is there an assured way to start reading messages from the beginning
> > > > > from a Java-based Kafka consumer?
> > > > > Once I reset one of my consumers to zero, do I have to do offset
> > > > > management myself for the other consumer threads, or does Kafka
> > > > > automatically lower the offset to the first thread's read offset?
> > > > >
> > > > > Any information / material pointing to the solution is highly
> > > > > appreciated.
> > > > >
> > > > > Thanks
> > > > > Pradeep
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Liquan Pei
> > > > Software Engineer, Confluent Inc
> > > >
> >
>
>
>
> --
> Liquan Pei
> Software Engineer, Confluent Inc
>
