kafka-users mailing list archives

From Pieter Hameete <pieter.hame...@blockbax.com>
Subject Kafka broker occasionally takes a very long time for log loading on startup
Date Mon, 08 Jun 2020 12:08:16 GMT

We are observing Kafka brokers occasionally taking far longer than usual to load logs on
startup (40 minutes versus 30-60 seconds). This happens when restarting a broker as part of
a rolling restart, following the procedure prescribed by Confluent here:
https://docs.confluent.io/current/kafka/post-deployment.html#rolling-restart
Controlled shutdown is enabled on the broker, and the shutdown was reported as successful.

We use Confluent Platform 5.5.0 (Kafka 2.5.0), and 3 brokers with a replication factor of
3 and at least 2 in-sync replicas. Each broker uses 1TB of AWS EBS for log storage.

Some observations:

  *   It is not always the same broker that takes a long time to start up.
  *   It does not only occur for the broker that is the active controller.
  *   A topic partition that normally loads quickly (15 ms) can take much longer (9549 ms)
on the same broker only a day later.
  *   We experienced this on Kafka 2.4.0 as well, though it did not occur for a few weeks
after upgrading to 2.5.0.
  *   EBS throughput is not lower during slow startups than during fast ones.
  *   The logs do not show any errors on startup when this happens.
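
One way to narrow this down is to compare per-partition load times between a fast startup and a slow one. Below is a minimal sketch of such a comparison. It assumes the broker log contains lines of roughly the form "[Log partition=<topic>-<n>, dir=...] Completed load of log ... in <N> ms"; that line shape, the regular expression, and the function names are assumptions made for illustration and may need adjusting to the actual log output.

```python
import re
from typing import Dict, Iterable, List, Tuple

# Assumed log line shape (not confirmed for every Kafka version):
#   [Log partition=my-topic-0, dir=/data] Completed load of log ... in 15 ms
LOAD_LINE = re.compile(
    r"partition=(?P<tp>[^,\]]+).*Completed load of log.* in (?P<ms>\d+) ?ms"
)


def parse_load_times(lines: Iterable[str]) -> Dict[str, int]:
    """Map each topic partition to its reported load time in milliseconds."""
    times: Dict[str, int] = {}
    for line in lines:
        match = LOAD_LINE.search(line)
        if match:
            times[match.group("tp")] = int(match.group("ms"))
    return times


def largest_slowdowns(
    fast: Dict[str, int], slow: Dict[str, int], top: int = 10
) -> List[Tuple[str, int, int]]:
    """Return (partition, fast_ms, slow_ms) rows, largest slowdown first."""
    rows = [(tp, fast[tp], slow[tp]) for tp in fast.keys() & slow.keys()]
    rows.sort(key=lambda row: row[2] - row[1], reverse=True)
    return rows[:top]
```

Feeding it the lines of the broker log from a fast startup and from a slow startup would show whether the extra time is spread evenly over all partitions or concentrated in a few of them.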

Does anyone have an idea what could be causing this? Or any suggestions on further tests to
conduct to get closer to the cause?

Thank you in advance and have a good day!

-- Pieter Hameete
