kafka-users mailing list archives

From Pieter Hameete <pieter.hame...@blockbax.com>
Subject Re: Kafka broker occasionally takes a very long time for log loading on startup
Date Mon, 08 Jun 2020 21:06:36 GMT
Hi Liam,

No, none of the indexes are corrupted. There are no errors in the log at startup.


From: Liam Clarke-Hutchinson <liam.clarke@adscale.co.nz>
Sent: Monday, 8 June 2020 22:31
To: users@kafka.apache.org <users@kafka.apache.org>
Subject: Re: Kafka broker occasionally takes a very long time for log loading on startup

Hi Pieter,

Do the logs on the slow broker show a lot of "rebuilding corrupted indexes" messages?


Liam Clarke-Hutchinson

On Tue, 9 Jun. 2020, 12:08 am Pieter Hameete, <pieter.hameete@blockbax.com> wrote:

> Hello,
> we are observing Kafka brokers occasionally taking a very long time to
> load logs on startup compared to usual (40 minutes versus 30-60 seconds).
> This happens when restarting a broker (as part of a rolling restart
> following the procedure prescribed by Confluent here
> https://docs.confluent.io/current/kafka/post-deployment.html#rolling-restart).
> The broker has controlled shutdown enabled, and the shutdown was reported
> as successful.
> We use Confluent Platform 5.5.0 (Kafka 2.5.0), and 3 brokers with a
> replication factor of 3 and at least 2 in-sync replicas. Each broker uses
> 1TB of AWS EBS for log storage.
> Some observations:
>   *   It is not always the same broker that takes a long time to start up
>   *   It does not only occur for the broker that is the active controller
>   *   A topic partition that normally loads quickly (15ms) can take a long
> time (9549ms) on the same broker only a day later.
>   *   We experienced this on Kafka 2.4.0 as well, though it did not occur
> for a few weeks after upgrading to 2.5.0
>   *   EBS throughput is not lower for slow startup cases compared to fast
> startup.
>   *   The logs do not show any errors on startup when this happens.
> Does anyone have an idea what could be causing this? Or any suggestions on
> further tests to conduct to get closer to the cause?
> Thank you in advance and have a good day!
> -- Pieter Hameete
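
One possible further test, following up on the per-partition observation above (15ms versus 9549ms), is to compare load times from a fast startup and a slow startup side by side and see which partitions regress the most. The Python sketch below assumes that the broker's server.log contains one line per partition roughly of the form "[Log partition=<topic>-<n>, dir=...] Completed load of log ... in <N> ms". That line shape, the function names, and the way the two log files are obtained are assumptions for illustration and are not confirmed anywhere in this thread.

```python
import re

# Assumed line shape (not confirmed by the thread), e.g.:
#   [...] INFO [Log partition=events-3, dir=/data] Completed load of log
#   with 12 segments, log start offset 0 and log end offset 100 in 15 ms (...)
LOAD_RE = re.compile(
    r"partition=(?P<tp>[^,\]]+).*?Completed load of log.*?in (?P<ms>\d+) ms"
)


def load_times(lines):
    """Return {topic-partition: load time in ms} for one broker startup."""
    times = {}
    for line in lines:
        match = LOAD_RE.search(line)
        if match:
            times[match.group("tp")] = int(match.group("ms"))
    return times


def slowest_regressions(fast, slow, top=10):
    """List (partition, fast_ms, slow_ms) for partitions seen in both runs,
    ordered by how much longer they took in the slow run."""
    common = fast.keys() & slow.keys()
    rows = [(tp, fast[tp], slow[tp]) for tp in common]
    rows.sort(key=lambda row: row[2] - row[1], reverse=True)
    return rows[:top]
```

To use it, pass the lines of each startup's log to load_times (for example, an open file object for each server.log) and then give the two resulting dictionaries to slowest_regressions. If the slowdown is spread evenly over all partitions, that points more towards the disk or the broker as a whole; if a handful of partitions dominate, their segment counts and sizes would be the next thing to look at.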
