kafka-users mailing list archives

From Joe Stein <joe.st...@stealth.ly>
Subject Re: undesirable log retention behavior
Date Fri, 01 Aug 2014 02:39:24 GMT
What version of Kafka are you using? Have you tried log.retention.bytes?
Whichever limit is reached first (TTL or total bytes) should do what you are
looking for, if I understand you correctly.
http://kafka.apache.org/documentation.html#brokerconfigs
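
To make the "whichever comes first" behavior concrete, here is a rough Python sketch of a retention check that applies a time limit and a byte limit together. It is a simplified model for illustration, not Kafka's actual implementation: the Segment class, the segments_to_delete function, and all parameter names are hypothetical. Also note that, as far as I know, log.retention.bytes is enforced per partition log rather than across the whole broker, so the byte budget below should be read as a per-log budget.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical stand-in for one on-disk log segment."""
    name: str
    size_bytes: int
    last_modified_s: float  # file mtime, in seconds since the epoch


def segments_to_delete(segments, now_s, retention_s, retention_bytes):
    """Return the names of segments a time-or-size policy would delete.

    `segments` must be ordered oldest first. A segment is deleted if
    EITHER rule selects it, which is what "whichever comes first" means.
    """
    doomed = []

    # Time rule: delete segments whose mtime is older than the window.
    for seg in segments:
        if now_s - seg.last_modified_s > retention_s:
            doomed.append(seg.name)

    # Size rule: delete the oldest remaining segments, but only while
    # doing so still leaves at least `retention_bytes` in the log.
    remaining = [s for s in segments if s.name not in doomed]
    excess = sum(s.size_bytes for s in remaining) - retention_bytes
    for seg in remaining:
        if excess - seg.size_bytes < 0:
            break
        excess -= seg.size_bytes
        doomed.append(seg.name)

    return doomed
```

With freshly copied segments (all mtimes equal to "now"), the time rule selects nothing, but the size rule still trims the oldest segments down to the byte budget, which is why the size limit helps in the replacement-instance case.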

/*******************************************
Joe Stein
Founder, Principal Consultant
Big Data Open Source Security LLC
http://www.stealth.ly
Twitter: @allthingshadoop
********************************************/
On Jul 31, 2014 6:52 PM, "Steven Wu" <stevenwu@netflix.com.invalid> wrote:

> It seems that log retention is based purely on the last touch/modified
> timestamp. This is undesirable for code pushes in AWS/cloud.
>
> E.g., let's say the retention window is 24 hours, the disk size is 1 TB, and
> disk util is 60% (600 GB). When a new instance comes up, it will fetch the log
> files (600 GB) from its peers. Those log files all get new timestamps, so they
> won't be purged until 24 hours later. Note that during those first 24 hours,
> new msgs (another 600 GB) continue to come in. This can fill the disk
> unless someone intervenes. With this behavior, we have to keep disk util
> under 50%.
>
> Can the last-modified timestamp be inserted into the file name when rolling
> over log files? Then Kafka can check the file name for the timestamp. Does
> this make sense?
>
> Thanks,
> Steven
>
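
The arithmetic in the quoted scenario can be written out as a small Python sketch. It assumes purely mtime-based retention, a steady ingest rate, and 1 TB = 1000 GB; the function names are hypothetical and only illustrate the reasoning behind the 50% figure.

```python
GB = 10**9


def peak_usage_after_replacement(steady_state_bytes):
    """Worst-case disk usage on a freshly replaced broker.

    The copied segments (one retention window of data) all carry new
    mtimes, so none expire until a full window has passed. During that
    window another full window of new data arrives on top of them.
    """
    copied = steady_state_bytes      # fetched from peers, mtimes reset
    incoming = steady_state_bytes    # new messages during the window
    return copied + incoming


def fits_on_disk(disk_bytes, steady_state_bytes):
    """True if the replacement broker survives the first window."""
    return peak_usage_after_replacement(steady_state_bytes) <= disk_bytes
```

For the numbers above, fits_on_disk(1000 * GB, 600 * GB) is False because the peak is 1200 GB, while a steady state of 500 GB (50% utilization) is the largest that still fits.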
