kafka-users mailing list archives

From Weide Zhang <weo...@gmail.com>
Subject Re: kafka consumer fail over
Date Mon, 04 Aug 2014 15:28:09 GMT
Thanks a lot Guozhang and Daniel.

Weide


On Mon, Aug 4, 2014 at 8:27 AM, Guozhang Wang <wangguoz@gmail.com> wrote:

> Weide,
>
> Like Daniel said, the rebalance logic is deterministic (round robin), so
> if you have n partitions in total, and each machine (master or slave) also
> has n threads, then all partitions will go to the master. When the master
> fails and restarts, the partitions will automatically go back to the old
> master.
>
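A minimal sketch of the setup described above, using the 0.8 high-level consumer API from the Consumer Group Example linked further down the thread. The ZooKeeper address, group id, topic name, and partition count are placeholder values; the point is only that the master asks for as many streams as there are partitions, so the deterministic rebalance hands it every partition.

import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

public class MasterConsumer {
    public static void main(String[] args) {
        // Placeholder values: point these at your own ZooKeeper quorum, group and topic.
        final String topic = "events";
        final int numPartitions = 4;  // total number of partitions of the topic (n)

        Properties props = new Properties();
        props.put("zookeeper.connect", "zk1:2181");
        props.put("group.id", "master-slave-group");
        props.put("zookeeper.session.timeout.ms", "6000");
        props.put("auto.commit.interval.ms", "1000");
        ConsumerConnector consumer =
                Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

        // Request as many streams (threads) as there are partitions, so that after
        // the deterministic rebalance this one machine owns every partition.
        Map<String, Integer> topicCountMap = Collections.singletonMap(topic, numPartitions);
        Map<String, List<KafkaStream<byte[], byte[]>>> streams =
                consumer.createMessageStreams(topicCountMap);

        ExecutorService executor = Executors.newFixedThreadPool(numPartitions);
        for (final KafkaStream<byte[], byte[]> stream : streams.get(topic)) {
            executor.submit(new Runnable() {
                public void run() {
                    ConsumerIterator<byte[], byte[]> it = stream.iterator();
                    while (it.hasNext()) {
                        byte[] message = it.next().message();
                        // aggregate and publish the result back to Kafka here
                    }
                }
            });
        }
    }
}

The slave can run exactly the same code with the same group id; as long as the master's threads hold all the partitions, the slave's threads simply idle until a rebalance gives them something.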
>
> On Sun, Aug 3, 2014 at 6:25 AM, Daniel Compton <desk@danielcompton.net>
> wrote:
>
> > Hi Weide
> >
> > The consumer rebalancing algorithm is deterministic. In your failure
> > scenario, when A comes back up again, the consumer threads will
> > rebalance. This will give you the initial consumer configuration at the
> > start of the test.
> >
> > I'm unsure whether the partitions are balanced round robin, or if they
> > will all go to A with the overflow going to B.
> >
> > If all of the messages need to be processed by a single machine, an
> > alternative architecture would be to have a standby server that waits
> > until master A fails and then connects as a consumer. This could be
> > accomplished by watching ZooKeeper and getting a notification when A's
> > ephemeral node is removed.
> >
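For the standby trigger Daniel describes, a minimal sketch against the plain ZooKeeper client API might look like the following. The znode path /myapp/master and the connection settings are hypothetical; the assumption is that the active master registers that path as an ephemeral node, so its disappearance signals the failover.

import java.util.concurrent.CountDownLatch;

import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class StandbyActivator {
    public static void main(String[] args) throws Exception {
        // Hypothetical path that the active master creates as an ephemeral znode.
        final String masterPath = "/myapp/master";
        final CountDownLatch masterGone = new CountDownLatch(1);

        ZooKeeper zk = new ZooKeeper("zk1:2181", 6000, new Watcher() {
            public void process(WatchedEvent event) {
                // Fires once the master's session expires and its ephemeral node is deleted.
                if (event.getType() == Event.EventType.NodeDeleted
                        && masterPath.equals(event.getPath())) {
                    masterGone.countDown();
                }
            }
        });

        // exists(path, true) registers the default watcher on the master's node.
        // (Watches are one-shot; production code would re-register after each event.)
        zk.exists(masterPath, true);

        masterGone.await();
        // The master is gone: start consuming here (e.g. the MasterConsumer sketch above)
        // so this machine's threads pick up all the partitions.
    }
}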
> > The high-level consumer does seem to be the way to go as long as your
> > application can handle duplicate processing.
> >
> > Daniel.
> >
> > > On 2/08/2014, at 1:38 pm, Weide Zhang <weoccc@gmail.com> wrote:
> > >
> > > Hi Guozhang,
> > >
> > > If I use the high-level consumer, how do I ensure all data goes to the
> > > master even if the slave is up and running? Is it just by forcing the
> > > master to have enough consumer threads to cover the maximum number of
> > > partitions of a topic, since the high-level consumer has no notion of
> > > which consumers are masters and which are slaves?
> > >
> > > For example, master A initiates enough threads such that it can cover
> > > all the partitions. Slave B is on standby with the same consumer group
> > > and the same number of threads, but since master A has enough threads
> > > to cover all the partitions, slave B won't get any data.
> > >
> > > Suddenly master A goes down, slave B becomes the new master, and it
> > > starts to get data based on the high-level consumer's rebalance design.
> > >
> > > After that, old master A comes up and becomes the slave. Will A get
> > > data? Or will A not get data, because B has enough threads to cover all
> > > partitions in the rebalancing logic?
> > >
> > > Thanks,
> > >
> > > Weide
> > >
> > >
> > >> On Fri, Aug 1, 2014 at 4:45 PM, Guozhang Wang <wangguoz@gmail.com>
> > >> wrote:
> > >>
> > >> Hello Weide,
> > >>
> > >> That should be doable via the high-level consumer; you can take a look
> > >> at this page:
> > >>
> > >> https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Group+Example
> > >>
> > >> Guozhang
> > >>
> > >>
> > >>> On Fri, Aug 1, 2014 at 3:20 PM, Weide Zhang <weoccc@gmail.com>
> > >>> wrote:
> > >>>
> > >>> Hi,
> > >>>
> > >>> I have a use case for a master-slave cluster where the logic inside
> > >>> the master needs to consume data from Kafka and publish some
> > >>> aggregated data back to Kafka. When the master dies, the slave needs
> > >>> to take the latest committed offset from the master and continue
> > >>> consuming the data from Kafka and doing the push.
> > >>>
> > >>> My question is: what would be the easiest Kafka consumer design for
> > >>> this scenario? I was thinking about using SimpleConsumer and doing
> > >>> manual consumer offset syncing between master and slave. That seems
> > >>> to solve the problem, but I was wondering if it can be achieved with
> > >>> the high-level consumer client?
> > >>>
> > >>> Thanks,
> > >>>
> > >>> Weide
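As for the hand-off of the last committed offset: with the high-level consumer, both machines can share one consumer group and let the ZooKeeper-stored offsets do the syncing, which is what the replies above suggest. A minimal sketch, reusing the same placeholder names as the earlier sketches and with auto-commit turned off so the offset only advances after the aggregated result has been published:

import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

public class AggregatingConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "zk1:2181");   // placeholder
        props.put("group.id", "master-slave-group");  // shared by master and slave
        props.put("auto.commit.enable", "false");     // commit only after publishing

        ConsumerConnector consumer =
                Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
        Map<String, List<KafkaStream<byte[], byte[]>>> streams =
                consumer.createMessageStreams(Collections.singletonMap("events", 1));

        ConsumerIterator<byte[], byte[]> it = streams.get("events").get(0).iterator();
        while (it.hasNext()) {
            byte[] message = it.next().message();
            // ... aggregate and publish the result back to Kafka ...
            consumer.commitOffsets();  // offset stored in ZooKeeper; a failover consumer
                                       // in the same group resumes from this point
        }
    }
}

Duplicates are still possible in the window between publishing and committing, which is why Daniel's note about handling duplicate processing applies.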
> > >>
> > >>
> > >>
> > >> --
> > >> -- Guozhang
> > >>
> >
>
>
>
> --
> -- Guozhang
>
