kafka-users mailing list archives

From Jay Kreps <jay.kr...@gmail.com>
Subject Re: Significance of multiple segment files in a partition
Date Tue, 25 Oct 2011 14:09:26 GMT
It is actually just to allow data deletion: the cleanup deletes whole
segments at a time. There is not much value in tuning the file size for
most situations. The tradeoff is that smaller files mean more open files,
but retention that tracks your retention.hours and retention.size settings
more closely.
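
As a rough illustration of that tradeoff, here is a toy Python model (not
Kafka's actual cleanup code). It assumes a constant write rate and a cleanup
that drops a whole segment once its last write is older than the retention
period; the function name `simulate` and all of its parameters are made up
for this sketch.

```python
def simulate(segment_bytes, bytes_per_hour, retention_hours, total_hours):
    """Toy model of segment-granular retention.

    Returns (segment_files_on_disk, age_in_hours_of_oldest_retained_data)
    after writing for total_hours. All names here are hypothetical.
    """
    segments = []  # each entry: [first_write_hour, last_write_hour, size]
    for hour in range(total_hours):
        # Roll to a new segment once the current one has reached its size limit.
        if not segments or segments[-1][2] >= segment_bytes:
            segments.append([hour, hour, 0])
        active = segments[-1]
        active[1] = hour
        active[2] += bytes_per_hour
        # Cleanup works on whole segments only: a segment is dropped when
        # its most recent write is older than the retention period.
        segments = [s for s in segments if hour - s[1] <= retention_hours]
    now = total_hours - 1
    return len(segments), now - segments[0][0]


if __name__ == "__main__":
    # Small segments (one hour of data each) vs. large ones (100 hours each).
    print(simulate(10, 10, 168, 1000))    # (169, 168)
    print(simulate(1000, 10, 168, 1000))  # (2, 199)
```

With retention at 168 hours, the one-hour segments leave 169 files on disk
and the oldest data is exactly 168 hours old. The 100-hour segments leave
only 2 files, but the oldest data is 199 hours old, because a segment cannot
be deleted until its newest message has expired.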

-Jay

On Tue, Oct 25, 2011 at 1:59 AM, Inder Pall <inder.pall@gmail.com> wrote:

> I am playing around with "log.file.size" (controls the size of a segment
> file in a partition) and "log.retention.hours" with the following config:
> log.file.size=500
> log.retention.hours=168
>
> Observation: I see multiple files getting generated within the same
> partition.
> Example: my topic name is revenuefeed and I see the following:
>
> ls -lh /tmp/kafka-logs/revenuefeed-0/*
> -rw-r--r-- 1 inder users 537 Oct 25 01:38 /tmp/kafka-logs/revenuefeed-0/00000000000000000000.kafka
> -rw-r--r-- 1 inder users 512 Oct 25 01:39 /tmp/kafka-logs/revenuefeed-0/00000000000000000537.kafka
>
> Questions
> --------------
> 1. Shouldn't these two properties go hand in hand?
> 2. Why would you want to have multiple files within a partition? The broker
> has to store more info to find the right file within a partition.
> 3. Is it to achieve an mmap-style optimization that lets the broker do
> less I/O when a feed is really huge, or is it for something else?
>
> -- Inder
>
