kafka-users mailing list archives

From Sanjay Awatramani <Sanjay.Awatram...@guavus.com>
Subject Re: Reliability against rack failure
Date Sun, 05 Aug 2018 18:31:05 GMT
Thanks for the quick response, Svante.
I forgot to mention that the deployment I am looking at has 2 racks. We
came up with this solution, but for this specific deployment adding a rack
is out of the question.
Is there a way to resolve this with 2 racks?

Regards,
Sanjay

On 05/08/18, 11:57 PM, "Svante Karlsson" <svante.karlsson@csi.se> wrote:

>3 racks, Replication Factor = 3, min.insync.replicas=2, acks=all
>
>2018-08-05 20:21 GMT+02:00 Sanjay Awatramani
><Sanjay.Awatramani@guavus.com>:
>
>> Hi,
>>
>> I have done some experiments and gone through the Kafka documentation,
>> which makes me conclude that there is a small chance of data loss or
>> loss of availability in a rack scenario. Can someone please validate my
>> understanding?
>>
>> The minimum configuration for a single-rack system to survive a single
>> machine failure is Replication Factor = 3, min.insync.replicas=2,
>> acks=all. This will ensure that the leader plus at least one follower
>> receive the data written by a producer, so there will be no data loss
>> and the system continues to be available for further writes by the
>> producer when a broker goes down.
>>
>> With rack awareness enabled, Kafka will distribute replicas of a
>> partition across racks, giving reliability in case of rack failure.
>> However, rack awareness is only concerned with distribution of
>> replicas, not prioritising the order of replication when followers
>> catch up with the leader.
>>
>> Moving to a rack-aware setup which has 2 racks, the above configuration
>> would create a problem because one of the racks will hold 2 of the 3
>> replicas, and if that rack goes down, data can be lost.
>>
>> Extending the minimum configuration to a 2-rack setup gives Replication
>> Factor = 4, min.insync.replicas=2, acks=all. This will ensure that when
>> a rack goes down, one of the replicas will be available as it would be
>> on a different rack than the leader. This was my understanding, and I
>> cannot find any documentation to back it. I studied the mechanism by
>> which a producer writes to the leader: all in-sync replicas (ISR) pull
>> the newest data, and if the leader confirms that at least
>> min.insync.replicas have got the newest data, it sends an ack back to
>> the producer. In a rack-aware system, I think Kafka will send an ack
>> even if the 2 replicas which are in sync are on the same rack. And if
>> that rack goes down at this instant, data is lost.
>>
>> If we make min.insync.replicas=3, we can guarantee that one of the
>> replicas will be on a different rack and data will not be lost. However
>> if any rack goes down, producer's writes will start failing as it won't
>> have the requisite replicas available.
>>
>> Is my understanding correct? Is there a way to configure Kafka in a
>> rack scenario to make it tolerant to data loss as well as make it
>> available for further writes even when a single node or an entire rack
>> goes down?
>>
>> Regards,
>> Sanjay
>>
>>
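
The trade-off described in the quoted message can be checked with a small
model. This is only a sketch under stated assumptions, not a description of
Kafka's internals: replicas are assumed to be spread round-robin across
racks, the ISR is assumed to be able to shrink to any subset of exactly
min.insync.replicas replicas, and an acknowledged write is called "at risk"
when every copy of it can sit on a single rack. The names spread_replicas
and analyse are made up for this illustration.

```python
from itertools import combinations


def spread_replicas(replication_factor, racks):
    """Assumed placement: replica i lives on rack i % racks (round-robin)."""
    return [i % racks for i in range(replication_factor)]


def analyse(replication_factor, min_insync, racks):
    """Return (acked_write_at_risk, writable_after_any_rack_loss)."""
    placement = spread_replicas(replication_factor, racks)
    replicas = range(replication_factor)

    # Worst case for durability: the ISR has shrunk to exactly min_insync
    # replicas. If some such set lies entirely on one rack, an acknowledged
    # write can exist only on that rack.
    acked_write_at_risk = any(
        len({placement[r] for r in isr}) == 1
        for isr in combinations(replicas, min_insync)
    )

    # Availability: after losing any single rack, enough replicas must
    # survive to satisfy min_insync again.
    writable_after_any_rack_loss = all(
        sum(1 for r in replicas if placement[r] != failed) >= min_insync
        for failed in range(racks)
    )
    return acked_write_at_risk, writable_after_any_rack_loss


if __name__ == "__main__":
    # (racks, replication factor, min.insync.replicas) from the thread
    for racks, rf, min_isr in [(2, 3, 2), (2, 4, 2), (2, 4, 3), (3, 3, 2)]:
        at_risk, writable = analyse(rf, min_isr, racks)
        print(f"racks={racks} RF={rf} min.isr={min_isr}: "
              f"acked write at risk={at_risk}, "
              f"writable after rack loss={writable}")
```

Under these assumptions the model agrees with the thread: with 2 racks,
RF=4 and min.insync.replicas=2 stays writable but leaves an acknowledged
write at risk, while min.insync.replicas=3 removes that risk but blocks
writes once a rack is lost. Only the 3-rack, RF=3, min.insync.replicas=2
layout is both safe and writable after a single rack failure.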

