kafka-users mailing list archives

From Eugen Dueck <eu...@tworks.co.jp>
Subject Re: log.dirs and SSDs
Date Thu, 12 Mar 2020 01:03:52 GMT
The reason I'm asking is that the documentation is quite brief:

num.io.threads: The number of threads that the server uses for processing requests, which
may include disk I/O

And none of the Google hits revealed any more details.

________________________________

I'm asking the questions here! 🙂
So is that the way to tune the broker if it does not reach the throughput the disk can deliver?

________________________________
From: Peter Bukowinski <pmbuko@gmail.com>
Sent: March 12, 2020 9:38

Couldn’t the same be accomplished by increasing the num.io.threads broker setting?

> On Mar 11, 2020, at 5:15 PM, Eugen Dueck <eugen@tworks.co.jp> wrote:
>
> So there is not e.g. a single thread responsible per directory in log.dirs that could
> become a bottleneck relative to SSD throughput of GB/s?
>
> This is in fact the case for Apache Pulsar, and the openmessaging benchmark uses 4 directories
> on the same SSD to increase throughput.
>
> ________________________________
> From: Peter Bukowinski <pmbuko@gmail.com>
> Sent: March 12, 2020 8:51
>
>> On Mar 11, 2020, at 4:28 PM, Eugen Dueck <eugen@tworks.co.jp> wrote:
>>
>> So log.dirs should contain only one entry per HDD disk, to avoid random seeks.
>> What about SSDs? Can throughput be increased by specifying multiple directories on
>> the same SSD?
>
>
> Given a constant number of partitions, I don’t see any advantage to splitting partitions
> among multiple log directories vs. keeping them all in one (per disk). You’d still have
> the same total number of topic-partition directories and the same number of topic-partition
> leaders.
>
> If you want to increase throughput, focus on using the appropriate number of partitions.
>
> —
> Peter Bukowinski
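
To make the point about a constant number of partitions concrete, here is a minimal Python sketch. It is a toy model, not broker code: the function assign_partitions, the topic name "events", and the /ssd/... paths are all made up for illustration, and the placement rule (put each new partition in the directory that currently holds the fewest) is an assumption about how a broker might spread partitions over log.dirs. It only shows that adding directories changes where partition directories live, not how many there are.

```python
from collections import Counter


def assign_partitions(partitions, log_dirs):
    """Place each partition in the log directory that currently holds the fewest.

    Hypothetical helper: a simplified stand-in for broker-side placement,
    assumed here purely for illustration.
    """
    counts = {d: 0 for d in log_dirs}
    placement = {}
    for partition in partitions:
        # Ties are broken by the order of log_dirs.
        target = min(log_dirs, key=lambda d: counts[d])
        placement[partition] = target
        counts[target] += 1
    return placement


# Twelve partitions of a hypothetical topic "events".
partitions = [f"events-{i}" for i in range(12)]

# One log directory on the SSD vs. four directories on the same SSD.
one_dir = assign_partitions(partitions, ["/ssd/kafka"])
four_dirs = assign_partitions(partitions, [f"/ssd/kafka{i}" for i in range(4)])

# The number of topic-partition directories is the same either way.
print(len(one_dir), len(four_dirs))

# Only the distribution across directories differs.
print(Counter(one_dir.values()))
print(Counter(four_dirs.values()))
```

In this model the only quantity that changes the amount of parallel work is the length of the partitions list, which matches the advice above to focus on the number of partitions.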
