kafka-users mailing list archives

From Eugen Dueck <eu...@tworks.co.jp>
Subject Re: synchronously flushing messages to disk
Date Sat, 07 Mar 2020 10:55:48 GMT
I was under the impression that these settings
log.flush.interval.messages=1
log.flush.interval.ms=0
guarantee a synchronous fsync for every message, i.e. when the producer receives an ack for
a message, it is guaranteed to have been persisted to as many disks as min.insync.replicas
requires.

As I have heard other opinions on that, I'd like to know if someone in the Kafka community
can clarify.

Best regards
Eugen
________________________________
From: Eugen Dueck <eugen@tworks.co.jp>
Sent: February 26, 2020 13:28
To: users@kafka.apache.org <users@kafka.apache.org>
Subject: synchronously flushing messages to disk

Hi

I want to benchmark Kafka, configured such that a message that has been acked by the broker
to the producer is guaranteed to have been persisted to disk. I changed the broker settings:

log.flush.interval.messages=1
log.flush.interval.ms=0

(Is this the proper way to do it?)

The impact is very noticeable. Without these settings, the msg/sec rate (1 producer,
1 topic, async, enable.idempotence) was well above 100k; with the above settings it drops
below 5k on this dev box with SSD storage. This huge drop seems to indicate that Kafka is
not doing any batch acking (which would allow it to do batch fsyncing).
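
To put the numbers above in perspective, here is a rough back-of-the-envelope sketch. It assumes (and this is only an assumption, not something measured) that fsyncs are issued one after another and are the sole bottleneck. The function names are made up for illustration.

```python
def per_message_budget_us(msgs_per_sec):
    """Time available per message, in microseconds, at a given rate."""
    return 1_000_000 / msgs_per_sec


def max_rate_with_batching(fsync_latency_us, msgs_per_fsync):
    """Upper bound on msg/sec if each fsync covers `msgs_per_fsync` messages.

    Assumes fsyncs run strictly one after another and nothing else limits
    throughput, which is a simplification.
    """
    return msgs_per_fsync * 1_000_000 / fsync_latency_us


# 5k msg/sec leaves 200 microseconds per message.
budget = per_message_budget_us(5000)

# Under the same assumption, one fsync per 20 messages would allow 100k msg/sec.
batched = max_rate_with_batching(budget, 20)
```

Under these assumptions, getting from 5k back to 100k msg/sec would require each fsync to cover about 20 messages.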

Is there a way to increase the msg/sec rate given the fsync constraint? It would seem that
adding topics/partitions would help in the case of a cluster, since the fsync load could be
distributed across multiple machines. Is there perhaps also a way to increase the rate per node?
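
To illustrate what I mean by batch fsyncing, independent of Kafka, here is a minimal sketch that appends messages to a plain local file and calls fsync either after every message or once per batch. It is not how the broker is implemented; `write_messages` and `compare` are hypothetical helpers, and the message count, message size and batch size are arbitrary.

```python
import os
import tempfile
import time


def write_messages(path, messages, fsync_every):
    """Append messages to `path`, fsyncing after every `fsync_every` messages.

    Returns the number of fsync calls made.
    """
    fsyncs = 0
    with open(path, "wb") as f:
        for i, msg in enumerate(messages, start=1):
            f.write(msg)
            if i % fsync_every == 0:
                f.flush()             # push Python's buffer to the OS
                os.fsync(f.fileno())  # ask the OS to persist to disk
                fsyncs += 1
        # Persist any trailing messages that did not fill a whole batch.
        if len(messages) % fsync_every != 0:
            f.flush()
            os.fsync(f.fileno())
            fsyncs += 1
    return fsyncs


def compare(n=2000, size=100, batch=100):
    """Time per-message fsync against batched fsync on a temporary file.

    Returns {label: (fsync_count, msgs_per_sec)}.
    """
    messages = [b"x" * size for _ in range(n)]
    results = {}
    with tempfile.TemporaryDirectory() as d:
        for label, every in (("per_message", 1), ("batched", batch)):
            path = os.path.join(d, label + ".log")
            start = time.perf_counter()
            fsyncs = write_messages(path, messages, every)
            elapsed = time.perf_counter() - start
            results[label] = (fsyncs, n / elapsed)
    return results
```

Calling `compare()` and printing the result shows the fsync count and msg/sec for both modes. The actual rates depend entirely on the disk and filesystem, so I am not quoting any numbers here.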

I'm using the latest Kafka, 2.4.0.

Best regards
Eugen
