kafka-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Stevo Slavić <ssla...@gmail.com>
Subject Re: Kafka settings for (more) reliable/durable messaging
Date Tue, 07 Jul 2015 20:26:59 GMT
Great feedback, thank you very much to both!

Kind regards,
Stevo Slavic.

On Tue, Jul 7, 2015 at 7:33 PM, Jiangjie Qin <jqin@linkedin.com.invalid>
wrote:

> The replica lag definition now is time based, so as long as a replica can
> catch up with leader in replica.lag.time.max.ms, it is in ISR, no matter
> how many messages it is behind.
>
> And yes, your understanding is correct - ACK is sent back either when all
> replica in ISR got the message or the request timeout.
>
> I had some related slides here might help a bit.
> http://www.slideshare.net/JiangjieQin/no-data-loss-pipeline-with-apache-kaf
> ka-49753844
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On 7/7/15, 9:28 AM, "Stevo Slavić" <sslavic@gmail.com> wrote:
>
> >Thanks for heads up and code reference!
> >
> >Traced back required offset to
> >
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/serve
> >r/ReplicaManager.scala#L303
> >
> >Have to investigate more, but from initial check was expecting to see
> >there
> >reference to "replica.lag.max.messages" (so even when replica is between 0
> >and maxLagMessages behind to be considered on required offset to be
> >considered as insync). Searching through trunk cannot find where in main
> >code is "replica.lag.max.messages" configuration property used.
> >
> >Used search query
> >
> https://github.com/apache/kafka/search?utf8=%E2%9C%93&q=%22replica.lag.max
> >.messages%22&type=Code
> >
> >Maybe it's going to be removed in next release?!
> >
> >Time based lag is still there.
> >
> >Anyway, if I understood correctly, with request.required.acks=-1, when a
> >message/batch is published, it's first written to lead, then other
> >partition replicas either continuously poll and get in sync with lead, or
> >through zookeeper get notified that they are behind and poll and get in
> >sync with lead, and as soon as enough (min.insync.replicas - 1) replicas
> >are detected to be fully in sync with lead, ACK is sent to producer
> >(unless
> >timeout occurs first).
> >
> >On Tue, Jul 7, 2015 at 5:15 PM, Gwen Shapira <gshapira@cloudera.com>
> >wrote:
> >
> >> Ah, I think I see the confusion: Replicas don't actually ACK at all.
> >> What happens is that the replica manager waits for enough ISR replicas
> >> to reach the correct offset
> >> Partition.checkEnoughReplicasReachOffset(...) has this logic. A
> >> replica can't reach offset of second batch, without first having
> >> written the first batch. So I believe we are safe in this scenario.
> >>
> >> Gwen
> >>
> >> On Tue, Jul 7, 2015 at 8:01 AM, Stevo Slavić <sslavic@gmail.com> wrote:
> >> > Hello Gwen,
> >> >
> >> > Thanks for fast response!
> >> >
> >> > Btw, congrats on officially becoming a Kafka committer and thanks,
> >>among
> >> > other things, for great "Intro to Kafka" video
> >> > http://shop.oreilly.com/product/0636920038603.do !
> >> >
> >> > Have to read more docs and/or source. I thought this scenario is
> >>possible
> >> > because replica can fall behind (replica.lag.max.messages) and still
> >>be
> >> > considered ISR. Then I assumed also write can be ACKed by any ISR, and
> >> then
> >> > why not by one which has fallen more behind.
> >> >
> >> > Kind regards,
> >> > Stevo Slavic.
> >> >
> >> > On Tue, Jul 7, 2015 at 4:47 PM, Gwen Shapira <gshapira@cloudera.com>
> >> wrote:
> >> >
> >> >> I am not sure "different replica" can ACK the second back of messages
> >> >> while not having the first - from what I can see, it will need to be
> >> >> up-to-date on the latest messages (i.e. correct HWM) in order to ACK.
> >> >>
> >> >> On Tue, Jul 7, 2015 at 7:13 AM, Stevo Slavić <sslavic@gmail.com>
> >>wrote:
> >> >> > Hello Apache Kafka community,
> >> >> >
> >> >> > Documentation for min.insync.replicas in
> >> >> > http://kafka.apache.org/documentation.html#brokerconfigs states:
> >> >> >
> >> >> > "When used together, min.insync.replicas and request.required.acks
> >> allow
> >> >> > you to enforce greater durability guarantees. A typical scenario
> >> would be
> >> >> > to create a topic with a replication factor of 3, set
> >> min.insync.replicas
> >> >> > to 2, and produce with request.required.acks of -1. This will
> >>ensure
> >> that
> >> >> > the producer raises an exception if a majority of replicas do
not
> >> >> receive a
> >> >> > write."
> >> >> >
> >> >> > Correct me if wrong (doc reference?), I assume min.insync.replicas
> >> >> includes
> >> >> > lead, so with min.insync.replicas=2, lead and one more replica
> >>besides
> >> >> lead
> >> >> > will have to ACK writes.
> >> >> >
> >> >> > In such setup, with minimalistic 3 brokers cluster, given that
> >> >> > - all 3 replicas are insync
> >> >> > - a batch of messages is written and ends up on lead and one
> >>replica
> >> ACKs
> >> >> > - another batch of messages ends up on lead and different replica
> >>ACKs
> >> >> >
> >> >> > Is it possible that when lead crashes, while replicas didn't catch
> >>up,
> >> >> > (part of) one batch of messages could be lost (since one replica
> >> becomes
> >> >> a
> >> >> > new lead, and it's only serving all reads and requests, and
> >> replication
> >> >> is
> >> >> > one way)?
> >> >> >
> >> >> > Kind regards,
> >> >> > Stevo Slavic.
> >> >>
> >>
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message