kafka-users mailing list archives

From James Cheng <wushuja...@gmail.com>
Subject Re: Under-replicated Partitions while rolling Kafka nodes in AWS
Date Thu, 05 Jan 2017 22:31:49 GMT

> On Jan 5, 2017, at 7:55 AM, Jack Lund <jack.lund@braintreepayments.com> wrote:
> Hello, all.
> We're running multiple Kafka clusters in AWS, and thus multiple Zookeeper
> clusters as well. When we roll out changes to our zookeeper nodes (which
> involves changes to the AMI, which means terminating the zookeeper instance
> and bringing up a new one in its place) we have to restart our Kafka
> brokers one at a time so they can pick up the new zookeeper IP address.

FYI, ZooKeeper 3.4.8 fixes the issue where you have to restart ZooKeeper nodes when their
DNS mapping changes. I'm not sure whether it also removes the need to restart the Kafka
brokers when the ZooKeeper DNS mapping changes, though.

https://zookeeper.apache.org/doc/r3.4.8/releasenotes.html
https://issues.apache.org/jira/browse/ZOOKEEPER-1506

> What we've noticed is that, as the brokers are restarted, we get alerts for
> under-replicated partitions, which seems strange since it seems like the
> shutdown process should take care of moving any replicas and the leadership
> election process.

During a controlled shutdown, you are right that *leadership* is moved from one broker to
another. But the replica list does not change. A topic assigned to brokers 1 2 3 for example
will only live on 1 2 3. If broker 1 is the leader for the topic, then during controlled shutdown
of 1, leadership may move to 2 or 3. But a broker 4 would never automatically take over as
replica for the topic.
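
To make that concrete, here is a toy model in Python. It is not Kafka code: the Partition
class and both functions are made up for illustration, and the only assumption is that a
partition counts as under-replicated when its in-sync replica set (ISR) is smaller than its
assigned replica list.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    replicas: list  # assigned broker ids; only changes on an explicit reassignment
    isr: set        # brokers currently in sync
    leader: int


def controlled_shutdown(partition, broker):
    """Simplified model: the broker leaves the ISR and hands off leadership."""
    partition.isr.discard(broker)
    if partition.leader == broker:
        # The new leader is picked from the existing replica list only.
        partition.leader = next(b for b in partition.replicas if b in partition.isr)


def is_under_replicated(partition):
    return len(partition.isr) < len(partition.replicas)


p = Partition(replicas=[1, 2, 3], isr={1, 2, 3}, leader=1)
controlled_shutdown(p, 1)
print(p.leader, p.replicas, is_under_replicated(p))
```

Leadership moves off broker 1, but the replica list is still 1 2 3 with only 2 and 3 in
sync, so the partition is reported as under-replicated until broker 1 comes back and
catches up.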

You can build such functionality yourself, if you wanted. You could, for example, move the
topic to 2 3 4 before shutting down 1, and then move it back to 1 2 3 once 1 is back up. But
that's a bunch of work you'd have to do yourself.
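
A rough sketch of how you might script the first half of that, in Python. It only builds
the JSON that the partition reassignment tool reads; the helper names, the topic name
"my-topic", and the hard-coded current assignment are all placeholders, and in practice
you would look up the real assignment first.

```python
import json


def replace_broker(replicas, old, new):
    """Drop `old` from the replica list and append `new` at the end,
    so an existing replica stays first in the list."""
    return [b for b in replicas if b != old] + [new]


def reassignment_plan(topic, assignments):
    """Build the JSON structure used for partition reassignment.
    `assignments` maps partition number to a list of broker ids."""
    return {
        "version": 1,
        "partitions": [
            {"topic": topic, "partition": part, "replicas": list(brokers)}
            for part, brokers in sorted(assignments.items())
        ],
    }


# Placeholder data: one topic with two partitions on brokers 1 2 3.
current = {0: [1, 2, 3], 1: [2, 3, 1]}

# Move everything off broker 1 and onto broker 4 before the restart.
before_restart = {p: replace_broker(r, 1, 4) for p, r in current.items()}

print(json.dumps(reassignment_plan("my-topic", before_restart), indent=2))
# Reusing `current` in a second plan moves the topic back afterwards.
```

You would save the output to a file and pass it to kafka-reassign-partitions.sh with
--reassignment-json-file and --execute, wait for the reassignment to finish, restart the
broker, and then apply the plan built from the original assignment. Keep in mind that each
move copies the partition data to the new broker, so this can be slow for large topics.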


> This is causing us some pain because it means that we get pages whenever we
> roll out changes to Zookeeper.
> Does anybody have any ideas why this would be happening, and how we can
> avoid it?
> Thanks.
> -Jack Lund
> Braintree Payments
