kafka-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chen Wang <chen.apache.s...@gmail.com>
Subject Re: error recovery in multiple thread reading from Kafka with HighLevel api
Date Fri, 08 Aug 2014 00:36:15 GMT
Guozhang,
Just to make it clear:
If I have 10 threads with the same consumer group id, read the topic T. The
auto commit is turned off, and commitOffset is called only when the message
is processed successfully.
If thread 1 dies when processing message from partition P1, and the last
offset is Offset1.   Then kafka will ensure that one of the other running 9
threads will automatically pick up the message on partition P1 from Offset1
? will the thread have the risk of reading the same message more than once?

Also I would assume commit offset for each message is a bit heavy. What you
guys usually do for error handling during reading kafka?
Thanks much!
Chen



On Thu, Aug 7, 2014 at 5:18 PM, Guozhang Wang <wangguoz@gmail.com> wrote:

> Yes, in that case you can turn of auto commit and call commitOffsets
> manually after processing is finished. commitOffsets() will only write the
> offset of the partitions that the consumer is currently fetching, so there
> is no need to coordinate this operation.
>
>
> On Thu, Aug 7, 2014 at 5:03 PM, Chen Wang <chen.apache.solr@gmail.com>
> wrote:
>
> > But with the auto commit turned on, I am risking off losing the failed
> > message, right? should I turn off the auto commit, and only commit the
> > offset when the message is processed successfully..But that would require
> > the coordination between threads in order to know what is the right
> timing
> > to commit offset..
> >
> >
> >
> > On Thu, Aug 7, 2014 at 4:54 PM, Guozhang Wang <wangguoz@gmail.com>
> wrote:
> >
> > > Hello Chen,
> > >
> > > With high-level consumer, the partition re-assignment is automatic upon
> > > consumer failures.
> > >
> > > Guozhang
> > >
> > >
> > > On Thu, Aug 7, 2014 at 4:41 PM, Chen Wang <chen.apache.solr@gmail.com>
> > > wrote:
> > >
> > > > Folks,
> > > >  I have a process started at specific time and read from a specific
> > > topic.
> > > > I am currently using the High Level API(consumer group) to read from
> > > > kafka(and will stop once there is nothing in the topic by specifying
> a
> > > > timeout). i am most concerned about error recovery in multiple thread
> > > > context. If one thread dies, will other running bolt threads picks up
> > the
> > > > failed message? Or I have to start another thread in order to pick up
> > the
> > > > failed message? What would be  a good practice to ensure the message
> > can
> > > be
> > > > processed at least once?
> > > >
> > > > Note that all threads are using the same group id.
> > > >
> > > > Thanks,
> > > > Chen
> > > >
> > >
> > >
> > >
> > > --
> > > -- Guozhang
> > >
> >
>
>
>
> --
> -- Guozhang
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message