kafka-users mailing list archives

From Ismael Juma <ism...@juma.me.uk>
Subject Re: Data loss while upgrading confluent 3.0.0 kafka cluster to confluent 3.2.2
Date Mon, 18 Sep 2017 16:32:52 GMT
Hi Yogesh,

Can you please clarify what you mean by "observing data loss"?

Ismael

On Mon, Sep 18, 2017 at 5:08 PM, Yogesh Sangvikar <
yogesh.sangvikar@gmail.com> wrote:

> Hi Team,
>
> Please help us find a resolution for the Kafka rolling upgrade issue below.
>
> Thanks,
>
> Yogesh
>
> On Monday, September 18, 2017 at 9:03:04 PM UTC+5:30, Yogesh Sangvikar
> wrote:
>>
>> Hi Team,
>>
>> Currently, we are using a Confluent 3.0.0 Kafka cluster in our production
>> environment, and we are planning to upgrade the cluster to Confluent
>> 3.2.2.
>> We have topics with millions of records, and data is continuously
>> published to them. We also use other Confluent services (Schema Registry,
>> Kafka Connect, and Kafka REST) to process the data.
>>
>> So, we cannot afford any downtime for the upgrade.
>>
>> We have tried a rolling Kafka upgrade in our development environment,
>> following the steps documented at:
>>
>> https://docs.confluent.io/3.2.2/upgrade.html
>>
>> https://kafka.apache.org/documentation/#upgrade
>>
>> But we are observing data loss on topics during the rolling upgrade /
>> restart of the brokers for "inter.broker.protocol.version=0.10.2".
>>
>> Based on our observations, we suspect the following root cause for the
>> data loss (explained for a topic partition with 3 replicas):
>>
>>    - As the broker protocol version is updated from 0.10.0 to 0.10.2
>>    in a rolling fashion, the in-sync replicas on the older version will
>>    not allow the updated (0.10.2) replicas back into the ISR until all
>>    brokers are updated.
>>    - Also, we have explicitly disabled "unclean.leader.election.enable",
>>    so only in-sync replicas can be elected leader for a given partition.
>>    - During the rolling update, as mentioned above, the older-version
>>    leader does not let the newer-version replicas catch up, so data
>>    pushed through that leader is not replicated. When that leader goes
>>    down for its upgrade, the updated replicas appear in the in-sync
>>    column and one of them becomes leader, but they lag behind the old
>>    leader's offset and only expose the data they had synced.
>>    - Once the last replica comes back up with the updated version, it
>>    starts syncing data from the current leader.
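
The failure mode sketched in the steps above (a leader accepting writes that its followers have not replicated, then bouncing) is exactly what quorum acknowledgement is meant to prevent. A hedged sketch of the standard Kafka durability settings involved; the specific values (`2`, `10`) are illustrative assumptions, not recommendations from this thread:

```properties
# Broker / topic settings (sketch): require each write to reach a quorum
# of replicas before it is acknowledged, so a bounced leader cannot take
# acknowledged data down with it.
min.insync.replicas=2
unclean.leader.election.enable=false

# Producer settings (sketch): wait for all in-sync replicas to acknowledge,
# and retry on the transient errors a rolling restart produces.
acks=all
retries=10
```

With `acks=all` and `min.insync.replicas=2`, a write that the producer saw acknowledged survives the loss of the leader; writes acknowledged under `acks=1` (or with the ISR shrunk to the leader alone) are the ones that can disappear in the scenario described above.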
>>
>>
>> Please share your comments on our observations and suggest the proper
>> procedure for a rolling Kafka upgrade, as we can't afford downtime.
>>
>> Thanks,
>> Yogesh
>>
>
