kafka-users mailing list archives

From Jayesh Thakrar <j_thak...@yahoo.com.INVALID>
Subject Re: Question: Data Loss and Data Duplication in Kafka
Date Tue, 06 Sep 2016 03:52:07 GMT
Thanks Radha Krishna!
So from what I understand, data loss can happen at the producer due to BufferExhaustedException,
failure to close/terminate the producer, or communication errors (first figure below).
And at the broker, loss can happen during unclean leader election (i.e. electing a leader that
was not in the ISR), or if a leader crashes while min.insync.replicas for the topic/broker is
1, in which case the small number of messages that had not yet been replicated within the
replica.lag.time.max.ms window can be lost.
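For reference, the broker/topic settings involved in that trade-off look roughly like this (values are illustrative, not recommendations):

```properties
# Require at least 2 in-sync replicas before a write with acks=all is acknowledged.
min.insync.replicas=2
# Disallow electing a leader that was not in the ISR (avoids the unclean-election loss).
unclean.leader.election.enable=false
# A follower that has not caught up within this window is dropped from the ISR;
# messages present only on the leader during this window are the ones at risk on a crash.
replica.lag.time.max.ms=10000
```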
Also, can you and the community validate the two pictures below?

[Two diagrams were attached here but are not preserved in this text archive.]

      From: R Krishna <krishna81m@gmail.com>
 To: users@kafka.apache.org; Jayesh Thakrar <j_thakrar@yahoo.com> 
 Sent: Tuesday, August 30, 2016 2:02 AM
 Subject: Re: Question: Data Loss and Data Duplication in Kafka
   
Experimenting with Kafka myself, I found that timeouts/batch expiry (with both valid and
invalid configurations) and exhausted retries can also drop messages unless you handle and
log the failures explicitly. There is also a hierarchy of exceptions under
org.apache.kafka.common.KafkaException, some of which are thrown for valid reasons
(message size, buffer size, etc.) but still result in dropped messages.
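To make the "handle and log them gracefully" point concrete, here is a minimal Python sketch of the pattern (this is a toy, not the real Kafka client API; all names are hypothetical): a completion callback routes failed sends to a log instead of silently dropping them.

```python
# Toy sketch of an async send with a completion callback, mimicking the
# pattern the real producer uses. All names here are for illustration only.

failed_log = []  # stand-in for a durable dead-letter log

def on_completion(message, error):
    """Callback invoked when a send finishes; error is None on success."""
    if error is not None:
        # Without this branch the failed message would be silently lost.
        failed_log.append((message, error))

def toy_send(message, callback, max_size=10):
    """Pretend producer: rejects oversized messages, as a real broker might."""
    error = "MessageSizeTooLargeException" if len(message) > max_size else None
    callback(message, error)

toy_send("ok", on_completion)
toy_send("this one is far too large", on_completion)
```

The point is only that every failure path ends in the callback, so nothing disappears without a trace.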


On Sun, Aug 28, 2016 at 1:55 AM, Jayesh Thakrar <j_thakrar@yahoo.com.invalid> wrote:

I am looking at ways how one might have data loss and duplication in a Kafka cluster and need
some help/pointers/discussions.
So far, here's what I have come up with:
Loss at producer-side: Since the send call actually adds data to a cache/buffer, a crash
of the producer can result in loss of the buffered data. Another loss scenario is a
producer exiting without closing the producer connection, so buffered messages are never flushed.
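As a minimal illustration of why an unflushed buffer loses data (a toy model in Python, not the real client), consider:

```python
class ToyProducer:
    """Toy model of a buffering producer; real clients buffer similarly."""
    def __init__(self):
        self.buffer = []      # records accepted by send() but not yet transmitted
        self.delivered = []   # records that actually reached the "broker"

    def send(self, record):
        # send() only enqueues; it returns before anything is on the wire.
        self.buffer.append(record)

    def flush(self):
        # Transmit everything buffered; closing the real client does this too.
        self.delivered.extend(self.buffer)
        self.buffer.clear()

p = ToyProducer()
p.send("a")
p.send("b")
# If the process exits (or crashes) at this point, "a" and "b" are lost:
lost_without_flush = list(p.buffer)
p.flush()  # calling flush()/close() before exit drains the buffer
```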
Loss at broker-side: I think there are several situations here - all triggered by a broker
or controller crash, or by network issues with ZooKeeper (which effectively simulate broker
crashes).
If I understand correctly, KAFKA-1211 (https://issues.apache.org/jira/browse/KAFKA-1211)
implies that when acks is set to 0 or 1 and the leader crashes, there is a probability of data
loss. Hopefully the implementation of leader generations will help avoid this
(https://issues.apache.org/jira/browse/KAFKA-1211?focusedCommentId=15402622&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15402622).
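The producer-side settings involved look like this (a sketch with illustrative values, not a recommendation):

```properties
# With acks=0 the producer does not wait at all; with acks=1 only the leader must
# persist the record, so a leader crash can lose it (the KAFKA-1211 case).
# acks=all waits for the full ISR (bounded below by min.insync.replicas).
acks=all
# Retry transient send failures instead of dropping the record.
retries=3
```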
And a unique situation described in KAFKA-3410 (https://issues.apache.org/jira/browse/KAFKA-3410)
can cause a broker or cluster shutdown, leading to data loss as described in KAFKA-3924
(resolved in 0.10.0.1).
And data duplication can be attributed primarily to consumer offset management, which is done
at batch/periodic intervals.
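A toy model of that duplication (hypothetical names; the offset is committed only every N records, as with periodic auto-commit):

```python
def consume(records, start_offset, commit_interval):
    """Process records from start_offset, committing the offset only every
    commit_interval records; returns (processed, last_committed_offset)."""
    processed = []
    committed = start_offset
    for offset in range(start_offset, len(records)):
        processed.append(records[offset])
        if (offset + 1) % commit_interval == 0:
            committed = offset + 1  # the periodic commit lags actual progress
    return processed, committed

records = ["r0", "r1", "r2", "r3", "r4"]
# First run processes all 5 records but crashes before the next commit,
# so only offset 3 is durably recorded:
first_pass, committed = consume(records, 0, commit_interval=3)
# Restarting from the last committed offset reprocesses r3 and r4:
second_pass, _ = consume(records, committed, commit_interval=3)
duplicates = second_pass  # records processed twice across the two runs
```

Anything processed after the last commit but before a crash is re-delivered on restart, which is exactly the batch/periodic-interval duplication described above.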
Can anyone think or know of any other scenarios?
Thanks,
Jayesh

-- 
Radha Krishna, Proddaturi
253-234-5657
