kafka-users mailing list archives

From Ján Koščo <3k.stan...@gmail.com>
Subject Re: Controlled shutdown not relinquishing leadership of all partitions
Date Wed, 13 Jan 2016 06:05:17 GMT
Not sure, but shouldn't the combination of auto.leader.rebalance.enable=true
and controlled.shutdown.enable=true sort this out for you?
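
For reference, a minimal sketch of how those two settings would look in a
broker's server.properties. Both are real broker settings; whether this
combination actually fixes the behavior described below is not confirmed
anywhere in this thread.

```properties
# Broker-side settings named above (sketch only, not a verified fix).
# Ask the controller to move partition leadership away before the broker exits.
controlled.shutdown.enable=true
# Let the controller periodically move leadership back to preferred replicas.
auto.leader.rebalance.enable=true
```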

2016-01-13 1:13 GMT+01:00 Scott Reynolds <sreynolds@twilio.com>:

> we use 0.9.0.0 and it is working fine. Not all the features work and a few
> things make a few assumptions about how zookeeper is used. But as a tool
> for provisioning, expanding and failure recovery it is working fine so far.
>
> *knocks on wood*
>
> On Tue, Jan 12, 2016 at 4:08 PM, Luke Steensen <
> luke.steensen@braintreepayments.com> wrote:
>
> > Ah, that's a good idea. Do you know if kafka-manager works with kafka 0.9
> > by chance? That would be a nice improvement over the cli tools.
> >
> > Thanks,
> > Luke
> >
> >
> > On Tue, Jan 12, 2016 at 4:53 PM, Scott Reynolds <sreynolds@twilio.com>
> > wrote:
> >
> > > Luke,
> > >
> > > We practice the same immutable pattern on AWS. To decommission a broker,
> > > we first use partition reassignment to move the partitions off of the
> > > node, then run a preferred leader election. To do this with a web ui, so
> > > that you can handle it on lizard brain at 3 am, we have the Yahoo Kafka
> > > Manager running on the broker hosts.
> > >
> > > https://github.com/yahoo/kafka-manager
> > >
> > > On Tue, Jan 12, 2016 at 2:50 PM, Luke Steensen <
> > > luke.steensen@braintreepayments.com> wrote:
> > >
> > > > Hello,
> > > >
> > > > We've run into a bit of a head-scratcher with a new kafka deployment
> > > > and I'm curious if anyone has any ideas.
> > > >
> > > > A little bit of background: this deployment uses "immutable
> > > > infrastructure" on AWS, so instead of configuring the host in-place, we
> > > > stop the broker, tear down the instance, and replace it wholesale. My
> > > > understanding was that controlled shutdown combined with producer
> > > > retries would allow this operation to be zero-downtime. Unfortunately,
> > > > things aren't working quite as I expected.
> > > >
> > > > After poring over the logs, I pieced together the following chain of
> > > > events:
> > > >
> > > >    1. our operations script stops the broker process and proceeds to
> > > >    terminate the instance
> > > >    2. our producer application detects the disconnect and requests
> > > >    updated metadata from another node
> > > >    3. updated metadata is returned successfully, but the downed broker
> > > >    is still listed as leader for a single partition of the given topic
> > > >    4. on the next produce request bound for that partition, the
> > > >    producer attempts to initiate a connection to the downed host
> > > >    5. because the instance has been terminated, the node is now in the
> > > >    "connecting" state until the system-level tcp timeout expires (2-3
> > > >    minutes)
> > > >    6. during this time, all produce requests to the given partition sit
> > > >    in the record accumulator until they expire and are immediately
> > > >    failed without retries
> > > >    7. the tcp timeout finally fires, the node is recognized as
> > > >    disconnected, more metadata is fetched, and things return to sanity
> > > >
> > > > I was able to work around the issue by waiting 60 seconds between
> > > > shutting down the broker and terminating that instance, as well as
> > > > raising request.timeout.ms on the producer to 2x our zookeeper timeout.
> > > > This gives the producer a much quicker "connection refused" error
> > > > instead of the connection timeout and seems to give enough time for
> > > > normal failure detection and leader election to kick in before requests
> > > > are timed out.
> > > >
> > > > So two questions really: (1) are there any known issues that would
> > > > cause a controlled shutdown to fail to release leadership of all
> > > > partitions? and (2) should the producer be timing out connection
> > > > attempts more proactively?
> > > >
> > > > Thanks,
> > > > Luke
> > > >
> > >
> >
>
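
The workaround Luke describes in the quoted message (pause 60 seconds between
stopping the broker and terminating the instance, and raise the producer's
request.timeout.ms to twice the ZooKeeper timeout) could be sketched roughly
as below. The function names, the callables passed in, and the default of 3
retries are hypothetical placeholders, not part of Kafka or of Luke's actual
operations script.

```python
import time


def producer_overrides(zk_session_timeout_ms, retries=3):
    """Producer settings following the workaround in the thread.

    zk_session_timeout_ms is assumed to be the ZooKeeper timeout the
    brokers are configured with; retries=3 is an illustrative value.
    """
    return {
        "retries": retries,
        # "2x our zookeeper timeout", as described in the message
        "request.timeout.ms": 2 * zk_session_timeout_ms,
    }


def replace_broker(stop_broker, terminate_instance,
                   grace_seconds=60, sleep=time.sleep):
    """Stop the broker, wait, then terminate the instance.

    stop_broker and terminate_instance are hypothetical callables supplied
    by the operations tooling. While the host is still up, connection
    attempts are refused quickly instead of hanging until the TCP timeout.
    """
    stop_broker()
    sleep(grace_seconds)
    terminate_instance()
```

With a ZooKeeper timeout of 6000 ms (an assumed value), producer_overrides(6000)
sets request.timeout.ms to 12000.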
