kafka-users mailing list archives

From Daniel Compton <d...@danielcompton.net>
Subject Re: kafka consumer fail over
Date Sun, 03 Aug 2014 13:25:03 GMT
Hi Weide,

The consumer rebalancing algorithm is deterministic. In your failure scenario, when A comes
back up again, the consumer threads will rebalance. This will give you the initial consumer
configuration at the start of the test. 

I'm unsure whether the partitions are balanced round robin, or whether they will all go to A,
with the overflow going to B. 
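If memory serves, the high-level consumer's default partition.assignment.strategy is "range": partitions and consumer thread ids are both sorted, and each thread takes a contiguous slice, with the first threads absorbing any remainder. Here is a simplified model of that logic (not the actual Kafka code, and the thread names are made up for illustration):

```python
def range_assign(partitions, consumer_threads):
    """Simplified model of the high-level consumer's "range" strategy:
    sort both lists, then hand each consumer thread a contiguous slice,
    with the first (len(partitions) % len(threads)) threads taking one extra."""
    parts = sorted(partitions)
    threads = sorted(consumer_threads)
    per, extra = divmod(len(parts), len(threads))
    assignment, start = {}, 0
    for i, t in enumerate(threads):
        n = per + (1 if i < extra else 0)
        assignment[t] = parts[start:start + n]
        start += n
    return assignment

# 4 partitions, 4 threads on machine A plus 4 on machine B, same group:
# A's thread ids sort first and absorb all the partitions, B's get none.
assignment = range_assign(
    [0, 1, 2, 3],
    ["A-0", "A-1", "A-2", "A-3", "B-0", "B-1", "B-2", "B-3"])
print(assignment)
# → {'A-0': [0], 'A-1': [1], 'A-2': [2], 'A-3': [3],
#    'B-0': [], 'B-1': [], 'B-2': [], 'B-3': []}
```

Note that this only holds while A's thread ids sort ahead of B's; the actual ids include the consumer id and a timestamp, so you'd want to verify the ordering in your own setup.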

If all of the messages need to be processed by a single machine, an alternative architecture
would be to have a standby server that waits until master A fails and then connects as a consumer.
This could be accomplished by watching Zookeeper and getting a notification when A's ephemeral
node is removed. 
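A sketch of that watch-and-takeover idea, with the ZooKeeper client abstracted behind a callback so the takeover logic is visible on its own (in a real deployment you would register this as a watch on the master's ephemeral node via a ZooKeeper client; the names here are invented for illustration):

```python
class StandbyConsumer:
    """Stays idle while the master's ephemeral znode exists; starts
    consuming only when notified that the znode has been removed."""

    def __init__(self, start_consuming):
        self.active = False
        self._start_consuming = start_consuming  # e.g. spins up consumer threads

    def on_znode_event(self, event_type):
        # In a real setup this method would be registered as the watch on
        # the master's ephemeral node (e.g. zk.exists("/masters/A", watch=...)).
        if event_type == "DELETED" and not self.active:
            self.active = True
            self._start_consuming()

started = []
standby = StandbyConsumer(lambda: started.append("B is now consuming"))
standby.on_znode_event("CREATED")   # master A registered: stay idle
standby.on_znode_event("DELETED")   # A's session expired: take over
print(standby.active, started)      # → True ['B is now consuming']
```

One thing to watch out for: ephemeral nodes only disappear after the ZooKeeper session times out, so failover latency is bounded below by the session timeout.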

The high level consumer does seem to be the way to go as long as your application can handle
duplicate processing. 
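One common way to make duplicate deliveries harmless is to make processing idempotent, for example by remembering the highest offset applied per partition and skipping anything at or below it. A minimal sketch (the message shape here is invented for illustration):

```python
class IdempotentProcessor:
    """Skips any message whose offset has already been applied for its
    partition, so redelivery after a rebalance does no harm."""

    def __init__(self, apply_fn):
        self._apply = apply_fn
        self._last_offset = {}  # partition -> highest offset applied

    def handle(self, partition, offset, value):
        if offset <= self._last_offset.get(partition, -1):
            return False  # duplicate: already processed
        self._apply(value)
        self._last_offset[partition] = offset
        return True

seen = []
p = IdempotentProcessor(seen.append)
p.handle(0, 0, "a")
p.handle(0, 1, "b")
p.handle(0, 1, "b")  # redelivered after a rebalance: ignored
print(seen)  # → ['a', 'b']
```

For this to survive a process restart, the last-applied offsets would need to be stored alongside the aggregated output rather than in memory.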

Daniel.

> On 2/08/2014, at 1:38 pm, Weide Zhang <weoccc@gmail.com> wrote:
> 
> Hi Guozhang,
> 
> If I use the high level consumer, how do I ensure all data goes to the master even
> if the slave is up and running? Is it just by forcing the master to have enough
> consumer threads to cover the maximum number of partitions of a topic, since
> the high level consumer has no notion of which consumers are masters and
> which are slaves?
> 
> For example, master A initiates enough threads to cover all the
> partitions. Slave B is on standby with the same consumer group and the same
> number of threads, but since master A has enough threads to cover all the
> partitions, slave B won't get any data.
> 
> Suddenly master A goes down, slave B becomes the new master, and it starts to
> get data based on the high level consumer's rebalance design.
> 
> After that, old master A comes up and becomes the slave. Will A get data? Or
> will A get no data because B has enough threads to cover all partitions in
> the rebalancing logic?
> 
> Thanks,
> 
> Weide
> 
> 
>> On Fri, Aug 1, 2014 at 4:45 PM, Guozhang Wang <wangguoz@gmail.com> wrote:
>> 
>> Hello Weide,
>> 
>> That should be doable via high-level consumer, you can take a look at this
>> page:
>> 
>> https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Group+Example
>> 
>> Guozhang
>> 
>> 
>>> On Fri, Aug 1, 2014 at 3:20 PM, Weide Zhang <weoccc@gmail.com> wrote:
>>> 
>>> Hi,
>>> 
>>> I have a use case for a master slave cluster where the logic inside the
>>> master needs to consume data from kafka and publish some aggregated data
>>> to kafka again. When the master dies, the slave needs to take the latest
>>> committed offset from the master and continue consuming the data from
>>> kafka and doing the push.
>>> 
>>> My question is: what would be the easiest kafka consumer design for this
>>> scenario? I was thinking about using SimpleConsumer and doing
>>> manual consumer offset syncing between master and slave. That seems to
>>> solve the problem, but I was wondering if it can be achieved by using the
>>> high level consumer client?
>>> 
>>> Thanks,
>>> 
>>> Weide
>> 
>> 
>> 
>> --
>> -- Guozhang
>> 
