kafka-users mailing list archives

From Guozhang Wang <wangg...@gmail.com>
Subject Re: kafka consumer fail over
Date Mon, 04 Aug 2014 15:27:02 GMT
Weide,

Like Daniel said, the rebalance logic is deterministic (round robin), so if
the topic has n partitions in total and each machine (master or slave) also
runs n consumer threads, then all partitions will go to the master. When the
master fails and restarts, the partitions will automatically go back to the
old master.
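
A minimal sketch of what the master's consumer could look like, following the
Consumer+Group+Example wiki page linked earlier in this thread; the ZooKeeper
address, group id, topic name, and partition count below are placeholders for
your own setup:

import java.util.*;
import java.util.concurrent.*;
import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

public class MasterConsumer {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.put("zookeeper.connect", "localhost:2181"); // placeholder address
    props.put("group.id", "failover-group");          // placeholder group id
    props.put("zookeeper.session.timeout.ms", "4000");
    props.put("auto.commit.interval.ms", "1000");

    ConsumerConnector consumer =
        Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

    String topic = "my-topic"; // placeholder topic
    int n = 8;                 // total number of partitions of the topic

    // Ask for n streams so this machine can own every partition after a
    // rebalance; a standby with the same settings then gets nothing while
    // this machine is alive.
    Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
    topicCountMap.put(topic, n);
    Map<String, List<KafkaStream<byte[], byte[]>>> streams =
        consumer.createMessageStreams(topicCountMap);

    ExecutorService executor = Executors.newFixedThreadPool(n);
    for (final KafkaStream<byte[], byte[]> stream : streams.get(topic)) {
      executor.submit(new Runnable() {
        public void run() {
          ConsumerIterator<byte[], byte[]> it = stream.iterator();
          while (it.hasNext()) {
            byte[] message = it.next().message();
            // aggregate and publish back to Kafka here
          }
        }
      });
    }
  }
}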


On Sun, Aug 3, 2014 at 6:25 AM, Daniel Compton <desk@danielcompton.net>
wrote:

> Hi Weide
>
> The consumer rebalancing algorithm is deterministic. In your failure
> scenario, when A comes back up again, the consumer threads will rebalance.
> This will give you the initial consumer configuration at the start of the
> test.
>
> I'm unsure whether the partitions are balanced round robin, or whether they
> will all go to A, with the overflow going to B.
>
> If all of the messages need to be processed by a single machine, an
> alternative architecture would be to have a standby server that waits until
> master A fails and then connects as a consumer. This could be accomplished
> by watching Zookeeper and getting a notification when A's ephemeral node is
> removed.
>
> The high level consumer does seem to be the way to go as long as your
> application can handle duplicate processing.
>
> Daniel.
>
> > On 2/08/2014, at 1:38 pm, Weide Zhang <weoccc@gmail.com> wrote:
> >
> > Hi Guozhang,
> >
> > If I use the high-level consumer, how do I ensure all data goes to the master
> > even if the slave is up and running? Is it just by forcing the master to have
> > enough consumer threads to cover the maximum number of partitions of a topic,
> > since the high-level consumer has no notion of which consumers are the master
> > and which are the slaves?
> >
> > For example, master A initiates enough threads to cover all the partitions.
> > Slave B is on standby with the same consumer group and the same number of
> > threads, but since master A has enough threads to cover all the partitions,
> > slave B won't get any data.
> >
> > Suddenly master A goes down, slave B becomes the new master, and it starts to
> > get data based on the high-level consumer's rebalance design.
> >
> > After that, old master A comes back up and becomes the slave. Will A get
> > data? Or will A not get data because B has enough threads to cover all the
> > partitions in the rebalancing logic?
> >
> > Thanks,
> >
> > Weide
> >
> >
> >> On Fri, Aug 1, 2014 at 4:45 PM, Guozhang Wang <wangguoz@gmail.com> wrote:
> >>
> >> Hello Weide,
> >>
> >> That should be doable via the high-level consumer; you can take a look at
> >> this page:
> >>
> >> https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Group+Example
> >>
> >> Guozhang
> >>
> >>
> >>> On Fri, Aug 1, 2014 at 3:20 PM, Weide Zhang <weoccc@gmail.com> wrote:
> >>>
> >>> Hi,
> >>>
> >>> I have a use case for a master-slave cluster where the logic inside the
> >>> master needs to consume data from Kafka and publish some aggregated data
> >>> back to Kafka. When the master dies, the slave needs to take the latest
> >>> committed offset from the master and continue consuming the data from Kafka
> >>> and doing the push.
> >>>
> >>> My question is: what would be the easiest Kafka consumer design for this
> >>> scenario? I was thinking about using SimpleConsumer and doing manual
> >>> consumer offset syncing between master and slave. That seems to solve the
> >>> problem, but I was wondering whether it can be achieved by using the
> >>> high-level consumer client?
> >>>
> >>> Thanks,
> >>>
> >>> Weide
> >>
> >>
> >>
> >> --
> >> -- Guozhang
> >>
>
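
For Daniel's standby suggestion, a rough sketch of watching the master's
ephemeral node in ZooKeeper could look like the following; the znode path,
connect string, and timeout are hypothetical, and how master A creates its
ephemeral node is up to your application:

import java.util.concurrent.CountDownLatch;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class StandbyTrigger {
  public static void main(String[] args) throws Exception {
    final CountDownLatch masterGone = new CountDownLatch(1);

    // Path of master A's ephemeral node -- hypothetical, created by A itself.
    final String masterPath = "/myapp/master";

    ZooKeeper zk = new ZooKeeper("localhost:2181", 4000, new Watcher() {
      public void process(WatchedEvent event) {
        if (event.getType() == Event.EventType.NodeDeleted
            && masterPath.equals(event.getPath())) {
          masterGone.countDown();
        }
      }
    });

    // Register a watch on the master's node; if it is already gone, fire now.
    if (zk.exists(masterPath, true) == null) {
      masterGone.countDown();
    }

    // Block until A's ephemeral node disappears, then take over.
    masterGone.await();
    System.out.println("Master is gone; start consumer threads here");
  }
}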



-- 
-- Guozhang
