kafka-users mailing list archives

From Gwen Shapira <g...@confluent.io>
Subject Re: Log Retention: What gets deleted
Date Sun, 10 Apr 2016 16:54:06 GMT
Agreed! That's a serious problem. We are trying to fix this in the upcoming
release.

Gwen

On Fri, Apr 8, 2016 at 2:56 PM, Anandha L Ranganathan
<analog.sony@gmail.com> wrote:

> Thanks.
>
> I have seen this in our system and would like to understand the behavior
> of the log segments.
>
> How do log segments get deleted when one of the ISR replicas moves to a
> new node?
> For example, say the ISR nodes for partition-0 are currently {1,2,3}, and
> for some reason the new ISR nodes are {2,3,4} after 2 days.
> Brokers {2,3} will contain log segments with creation date T1 for
> partition-0.
> Broker {4} will have a different log segment creation date, T2, for
> partition-0.
>
> Is the deletion of log segments based on broker {4} or on brokers {2,3}?
> We noticed that the latest log segment timestamp applies, and this
> sometimes requires more disk space than anticipated.
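To make the scenario above concrete, here is a minimal sketch, assuming each broker applies time-based retention independently to the timestamps of its own local segment files. The function name and the values chosen for T1, T2 and the retention period are hypothetical and only illustrate why the replica on broker {4} could hold the data longer.

```python
HOUR_MS = 60 * 60 * 1000


def earliest_delete_time_ms(segment_timestamp_ms, retention_ms):
    """Earliest time a segment becomes eligible for deletion (simplified model)."""
    return segment_timestamp_ms + retention_ms


retention_ms = 168 * HOUR_MS   # assumed retention of 7 days
t1 = 0                         # hypothetical segment timestamp on brokers {2,3}
t2 = 48 * HOUR_MS              # hypothetical timestamp on broker {4}, 2 days later

# How much longer broker {4} keeps the same data compared with brokers {2,3}.
extra_ms = (earliest_delete_time_ms(t2, retention_ms)
            - earliest_delete_time_ms(t1, retention_ms))
print(extra_ms // HOUR_MS)  # 48
```

Under these assumptions the newer replica keeps the segment for the extra 2 days, which matches the additional disk usage described above.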
>
>
>
>
>
> On Fri, Apr 8, 2016 at 1:07 PM Gwen Shapira <gwen@confluent.io> wrote:
>
> > Yes. It is whichever is shorter :)
> >
> > Another clarification:
> > A segment is deleted as a whole, based on the newest event in the
> > segment. So if the newest event is too recent to delete, the older
> > events in the segment will also be kept around.
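The "whichever is shorter" rule for rolling can be sketched as follows. This is a simplified model, not the broker's actual code: the `ActiveSegment` class and `should_roll` function are hypothetical names, and the defaults mirror the 1 GB and 168 hour values mentioned in the thread.

```python
from dataclasses import dataclass


@dataclass
class ActiveSegment:
    size_bytes: int   # bytes written to the segment so far
    created_ms: int   # time the segment was created, in milliseconds


def should_roll(segment, now_ms,
                segment_bytes=1024 ** 3,            # stands in for log.segment.bytes
                roll_ms=168 * 60 * 60 * 1000):      # stands in for log.roll.ms
    """Roll a new segment when either limit is reached, whichever comes first."""
    too_big = segment.size_bytes >= segment_bytes
    too_old = now_ms - segment.created_ms >= roll_ms
    return too_big or too_old
```

With these defaults, a busy partition rolls on size long before the 168 hours elapse, while a quiet partition rolls on time even though its segment is far below 1 GB.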
> >
> > On Fri, Apr 8, 2016 at 12:52 PM, Anandha L Ranganathan <
> > analog.sony@gmail.com> wrote:
> >
> > > Just a clarification based on Gwen's reply
> > >
> > > *log.segment.bytes* - by default this property is set to 1 GB.
> > > If we haven't set any value for *log.roll.ms*, it again defaults to
> > > 168 hours. In that case, will it roll out a new log segment file
> > > after every 1 GB?
> > >
> > > On Fri, Apr 8, 2016 at 11:32 AM Heath Ivie <hivie@autoanything.com> wrote:
> > >
> > > > Gwen,
> > > >
> > > > Thanks for the detailed reply.
> > > >
> > > > That makes it clearer for me.
> > > >
> > > > Heath
> > > >
> > > > -----Original Message-----
> > > > From: Gwen Shapira [mailto:gwen@confluent.io]
> > > > Sent: Tuesday, April 05, 2016 6:13 PM
> > > > To: users@kafka.apache.org
> > > > Subject: Re: Log Retention: What gets deleted
> > > >
> > > > I think you got it almost right. The missing part is that we only
> > > > delete whole partition segments, not individual messages.
> > > >
> > > > As you are writing messages, every X bytes or Y milliseconds, a new
> > > > file gets created for the partition to store new messages in. Those
> > > > files are called segments.
> > > > The segment you are currently writing to is an active segment.
> > > >
> > > > We will never delete an active segment, so in order to delete old
> > > > messages we will look for an inactive segment where the newest
> > > > message is older than our retention and delete the entire segment.
> > > >
> > > > So there are several parameters controlling when data will get
> > > > deleted (I'm looking at just the time-based ones, not the size-based):
> > > > 1. log.retention.ms - how old messages should be before we consider
> > > >    them for deletion
> > > > 2. log.roll.ms - how frequently we roll new segments. Messages will
> > > >    not get deleted before a new segment is rolled
> > > > 3. log.retention.check.interval.ms - how frequently we check for
> > > >    segments that we can delete.
> > > >
> > > > A message will be deleted if all 3 are true:
> > > > 1. It is older than log.retention.ms
> > > > 2. It is in an inactive segment, meaning enough time passed since the
> > > >    message was written to roll a new segment
> > > > 3. Kafka checked for segments that can be deleted, meaning that more
> > > >    than log.retention.check.interval.ms time passed since the segment
> > > >    was rolled.
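The conditions above can be sketched as a single retention pass. This is a minimal illustration under stated assumptions, not Kafka's implementation: `Segment`, `newest_message_ms` and `deletable_segments` are hypothetical names, and the function models one run of the periodic check that log.retention.check.interval.ms schedules.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    name: str
    newest_message_ms: int   # timestamp of the newest message in the segment
    active: bool = False     # True for the segment currently being written to


def deletable_segments(segments: List[Segment], now_ms: int,
                       retention_ms: int) -> List[Segment]:
    """Return inactive segments whose newest message is older than the retention."""
    return [
        s for s in segments
        if not s.active and now_ms - s.newest_message_ms > retention_ms
    ]


segments = [
    Segment("seg-0", newest_message_ms=2_000),               # old and inactive
    Segment("seg-1", newest_message_ms=9_500),               # inactive but too recent
    Segment("seg-2", newest_message_ms=9_900, active=True),  # active, never deleted
]
print([s.name for s in deletable_segments(segments, now_ms=10_000, retention_ms=1_000)])
# ['seg-0']
```

Because whole segments are removed at once, every message in `seg-0` disappears together, which is consistent with the lag dropping to 0 in one step rather than shrinking gradually.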
> > > >
> > > > Hope this helps,
> > > >
> > > > Gwen
> > > >
> > > >
> > > >
> > > > On Fri, Apr 1, 2016 at 12:21 PM, Heath Ivie <hivie@autoanything.com>
> > > > wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I have some questions about the log retention and specifically what
> > > > > gets deleted.
> > > > >
> > > > > I have a test app where I am writing 10 logs to the topic every
> > > > > second.
> > > > >
> > > > > What I would expect is that the lag in a group would be somewhere
> > > > > around 10 if I have retention.ms at 1000.
> > > > >
> > > > > What I am seeing is that the lag continues to grow, but then at
> > > > > some point all messages are gone and the lag is at 0.
> > > > >
> > > > > I thought that the messages that are old would be deleted first.
> > > > >
> > > > > Am I misinterpreting how the log retention works?
> > > > >
> > > > > Heath Ivie
> > > > > Solutions Architect
> > > > >
> > > >
> > >
> >
>
