kafka-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jay Kreps <jay.kr...@gmail.com>
Subject Re: large amount of disk space freed on restart
Date Mon, 09 Sep 2013 17:47:38 GMT
This could certainly be done. It would be slightly involved since you would
need to implement some kind of file-handle cache for both indexes and log
files and re-open them on demand when a read occurs. If someone wants to
take a shot at this, the first step would be to get a design wiki in place
on how this would work. This is potentially nice to reduce the open file
count (though open files are pretty cheap).

That said this issue only impacts xfs and it seems to be fixed by that
setting jonathan found. I wonder if you could give that a try and see if it
works for you too? I feel dealing with closed files does add a lot of
complexity so if there is an easy fix I would probably rather avoid it.

-Jay


On Mon, Sep 9, 2013 at 8:17 AM, Mike Heffner <mike@librato.com> wrote:

> We are also seeing this problem with version 0.7.1 and logs on an XFS
> partition. At our largest scale we can frequently free over 600GB of disk
> usage by simply restarting Kafka. We've examined the `lsof` output from the
> Kafka process and while it does appear to have FDs open for all log files
> on disk (even those long past read from), it does not have any files open
> that were previously deleted from disk.
>
> Du output agrees that the seen size is much larger than apparent-size size:
>
> root@kafkanode-1:/raid0/kafka-logs/measures-0# du -h
> 00000000242666442619.kafka
> 1.1G 00000000242666442619.kafka
> root@kafkanode-1:/raid0/kafka-logs/measures-0# du -h --apparent-size
> 00000000242666442619.kafka
> 513M 00000000242666442619.kafka
>
>
> Our log size/retention policy is:
>
> log.file.size=536870912
> log.retention.hours=96
>
> We tried dropping the caches from the Stack Overflow suggestion (sync; echo
> 3 > /proc/sys/vm/drop_caches) but that didn't seem to clear up the extra
> space. Haven't had the chance to try remounting with the allocsize option.
>
> In summary, it would be great if Kafka would close FD's to log files that
> hadn't been read from for some period of time if it addresses this issue.
>
> Cheers,
>
> Mike
>
> On Fri, Jul 26, 2013 at 5:03 PM, Jay Kreps <jay.kreps@gmail.com> wrote:
>
> > Cool, good to know.
> >
> >
> > On Fri, Jul 26, 2013 at 2:00 PM, Jason Rosenberg <jbr@squareup.com>
> wrote:
> >
> > > Jay,
> > >
> > > My only experience so far with this is using XFS.  It appears the XFS
> > > behavior is evolving, and in fact, we see somewhat different behavior
> > from
> > > 2 of our CentOS kernel versions in use.  I've been trying to ask
> > questions
> > > about all this on the XFS.org mailing list, but so far, having not much
> > > luck understanding the xfs versioning correlated to CentOS versions.
> > >
> > > Anyway, yes, I think it would definitely be worth trying the solution
> you
> > > suggest, which would be to close the file on rotation, and re-open
> > > read-only.  Or to close files after a few hours of not being accessed.
> > If
> > > a patch for one of these approaches can be cobbled together, I'd love
> to
> > > test it out on our staging environment.  I'd be willing to experiment
> > with
> > > such a patch myself, although I'm not 100% of all the places to look
> (but
> > > might dive in).
> > >
> > > Xfs appears to the option of using dynamic, speculative preallocation,
> in
> > > which case it progressively doubles the amount of space reserved for a
> > > file, as the file grows.  It does do this for all open files.  If the
> > file
> > > is closed, it will then release the preallocated space not in use.
>  It's
> > > not clear whether this releasing of space happens immediately on close,
> > and
> > > whether re-opening the file read-only immediately, will keep it from
> > > releasing space (still trying to gather more info on that).
> > >
> > > I haven't looked too much at the index files, but those too appear to
> > have
> > > this behavior (e.g. preallocated size is always on the order of double
> > the
> > > actual size, until the app is restarted).
> > >
> > > Jason
> > >
> > >
> > > On Fri, Jul 26, 2013 at 12:46 PM, Jay Kreps <jay.kreps@gmail.com>
> wrote:
> > >
> > > > Interesting.
> > > >
> > > > Yes, Kafka keeps all log files open indefinitely. There is no
> inherent
> > > > reason this needs to be the case, though, it would be possible to LRU
> > out
> > > > old file descriptors and close them if they are not accessed for a
> few
> > > > hours and then reopen on the first access. We just haven't
> implemented
> > > > anything like that.
> > > >
> > > > It would be good to understand this a little better. Does xfs
> > > pre-allocate
> > > > space for all open files? Perhaps just closing the file on log role
> and
> > > > opening it read-only would solve the issue? Is this at all related to
> > the
> > > > use of sparse files for the indexes (i.e.
> > > RandomAccessFile.setLength(10MB)
> > > > when we create the index)? Does this effect other filesystems or just
> > > xfs?
> > > >
> > > > -Jay
> > > >
> > > >
> > > > On Fri, Jul 26, 2013 at 12:42 AM, Jason Rosenberg <jbr@squareup.com>
> > > > wrote:
> > > >
> > > > > It looks like xfs will reclaim the preallocated space for a file,
> > after
> > > > it
> > > > > is closed.
> > > > >
> > > > > Does kafka close a file after it has reached it's max size and
> > started
> > > > > writing to the next log file in sequence?  Or does it keep all open
> > > until
> > > > > they are deleted, or the server quits (that's what it seems like).
> > > > >
> > > > > I could imagine that it might need to keep log files open, in order
> > to
> > > > > allow consumers access to them.  But does it keep them open
> > > indefinitely,
> > > > > after there is no longer any data to be written to them, and no
> > > consumers
> > > > > are currently attempting to read from them?
> > > > >
> > > > > Jason
> > > > >
> > > > >
> > > > > On Tue, Jul 16, 2013 at 4:32 PM, Jay Kreps <jay.kreps@gmail.com>
> > > wrote:
> > > > >
> > > > > > Interesting. Yes it will respect whatever setting it is given
for
> > new
> > > > > > segments created from that point on.
> > > > > >
> > > > > > -Jay
> > > > > >
> > > > > >
> > > > > > On Tue, Jul 16, 2013 at 11:23 AM, Jason Rosenberg <
> > jbr@squareup.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Ok,
> > > > > > >
> > > > > > > An update on this.  It seems we are using XFS, which is
> available
> > > in
> > > > > > newer
> > > > > > > versions of Centos.  It definitely does pre-allocate space
as a
> > > file
> > > > > > grows,
> > > > > > > see:
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> http://serverfault.com/questions/406069/why-are-my-xfs-filesystems-suddenly-consuming-more-space-and-full-of-sparse-file
> > > > > > >
> > > > > > > Apparently it's not hard-allocated space, and would be
released
> > > under
> > > > > > > resource pressure....seems we may need to update how we
monitor
> > > disk
> > > > > > space
> > > > > > > usage, etc....
> > > > > > >
> > > > > > > But, it seems that the default log file size of 1.1Gb,
causes
> it
> > to
> > > > > jump
> > > > > > to
> > > > > > > preallocate an extra Gb.  So, in theory, if  I set a strategic
> > log
> > > > file
> > > > > > > size to be just under the threshold that forces it to
> > exponentially
> > > > > > double
> > > > > > > the size from 1Gb to 2Gb, I should be able to mostly solve
this
> > > > issue.
> > > > > > >  E.g. use 950Mb instead of 1.1Gb max log file size.
> > > > > > >
> > > > > > > If I change the max log file size for a broker, and restart
it,
> > > will
> > > > it
> > > > > > > respect the new size going forward?
> > > > > > >
> > > > > > > Jason
> > > > > > >
> > > > > > >
> > > > > > > On Sun, Jul 14, 2013 at 9:44 AM, Jay Kreps <
> jay.kreps@gmail.com>
> > > > > wrote:
> > > > > > >
> > > > > > > > Hey Jason,
> > > > > > > >
> > > > > > > > As Jun says, we haven't seen that issue and no one
else has
> > > > reported
> > > > > > that
> > > > > > > > but it sounds like a bug of some kind.
> > > > > > > >
> > > > > > > > In 0.7 we don't do any preallocation of anything.
The only
> time
> > > > files
> > > > > > > > shrink is during recovery--we re-checksum all messages
that
> may
> > > not
> > > > > > have
> > > > > > > > been flushed and if any invalid messages are found
we
> truncate
> > > them
> > > > > > off.
> > > > > > > > This would only occur after an unclean shutdown (kill
-9) and
> > you
> > > > > would
> > > > > > > get
> > > > > > > > a bunch of errors in log warning you this was happening.
> > > > > > > >
> > > > > > > > -Jay
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Sun, Jul 14, 2013 at 1:36 AM, Jason Rosenberg <
> > > jbr@squareup.com
> > > > >
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > > An update on this.  It appears that the phenomenon
I'm
> seeing
> > > is
> > > > > that
> > > > > > > > disk
> > > > > > > > > space is freed on restart, but it's not due files
getting
> > > deleted
> > > > > on
> > > > > > > > > restart, but instead files are getting truncated
on
> restart.
> > >  It
> > > > > > > appears
> > > > > > > > > that log files get pre-allocated to a larger
size than is
> > used
> > > > > right
> > > > > > > > away.
> > > > > > > > >  Upon restart, they get truncated to the size
of the file
> > that
> > > > > > actually
> > > > > > > > > contains data.  Does this make sense?
> > > > > > > > >
> > > > > > > > > Before restart, I see a large number of log files
size
> 2.1Gb.
> > > >  Upon
> > > > > > > > > restart, the disk space reclaimed drops to almost
half
> that,
> > on
> > > > > > > average.
> > > > > > > > >
> > > > > > > > > Thoughts?
> > > > > > > > >
> > > > > > > > > Jason
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Thu, May 23, 2013 at 8:55 PM, Jun Rao <junrao@gmail.com
> >
> > > > wrote:
> > > > > > > > >
> > > > > > > > > > I haven't seen this issue before. We do
have ~1K topics
> in
> > > one
> > > > of
> > > > > > the
> > > > > > > > > Kafka
> > > > > > > > > > clusters at LinkedIn.
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > >
> > > > > > > > > > Jun
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Thu, May 23, 2013 at 11:05 AM, Jason
Rosenberg <
> > > > > > jbr@squareup.com>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Yeah, that's what it looks like to
me (looking at the
> > > code).
> > > > >  So,
> > > > > > > I'm
> > > > > > > > > > > guessing it's some os level caching,
resource
> recycling.
> > > >  Have
> > > > > > you
> > > > > > > > ever
> > > > > > > > > > > heard of this happening?  One thing
that might be
> > different
> > > > in
> > > > > my
> > > > > > > > usage
> > > > > > > > > > > from the norm is a relatively large
number of topics
> > (e.g.
> > > > ~2K
> > > > > > > > topics).
> > > > > > > > > > >
> > > > > > > > > > > Jason
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Thu, May 23, 2013 at 7:14 AM, Jun
Rao <
> > junrao@gmail.com
> > > >
> > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Jason,
> > > > > > > > > > > >
> > > > > > > > > > > > Kafka closes the handler of all
delete files.
> > Otherwise,
> > > > the
> > > > > > > broker
> > > > > > > > > > will
> > > > > > > > > > > > run out of file handler quickly.
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > >
> > > > > > > > > > > > Jun
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, May 22, 2013 at 10:17
PM, Jason Rosenberg <
> > > > > > > > jbr@squareup.com>
> > > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > So, does this indicate kafka
(or the jvm itself) is
> > not
> > > > > > > > > aggressively
> > > > > > > > > > > > > closing file handles of deleted
files?  Is there a
> > fix
> > > > for
> > > > > > > this?
> > > > > > > > >  Or
> > > > > > > > > > is
> > > > > > > > > > > > > there not likely anything
to be done?  What happens
> > if
> > > > the
> > > > > > disk
> > > > > > > > > fills
> > > > > > > > > > > up
> > > > > > > > > > > > > with file handles for phantom
deleted files?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Jason
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Wed, May 22, 2013 at 9:50
PM, Jonathan Creasy <
> > > > > jc@box.com
> > > > > > >
> > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > It isn't uncommon if
a process has an open file
> > > handle
> > > > > on a
> > > > > > > > file
> > > > > > > > > > that
> > > > > > > > > > > > is
> > > > > > > > > > > > > > deleted, the space is
not freed until the handle
> is
> > > > > closed.
> > > > > > > So
> > > > > > > > > > > > restarting
> > > > > > > > > > > > > > the process that has
a handle on the file would
> > cause
> > > > the
> > > > > > > space
> > > > > > > > > to
> > > > > > > > > > be
> > > > > > > > > > > > > freed
> > > > > > > > > > > > > > also.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > You can troubleshoot
that with lsof.
> > > > > > > > > > > > > > Normally, I see 2-4
log segments deleted every
> hour
> > > in
> > > > my
> > > > > > > > > brokers.
> > > > > > > > > >  I
> > > > > > > > > > > > see
> > > > > > > > > > > > > > log lines like this:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 2013-05-23 04:40:06,857
 INFO
> [kafka-logcleaner-0]
> > > > > > > > > log.LogManager -
> > > > > > > > > > > > > > Deleting log segment
00000000035434043157.kafka
> > from
> > > > > > > <redacted
> > > > > > > > > > topic>
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > However, it seems like
if I restart the broker, a
> > > > massive
> > > > > > > > amount
> > > > > > > > > of
> > > > > > > > > > > > disk
> > > > > > > > > > > > > > space is freed (without
a corresponding flood of
> > > these
> > > > > log
> > > > > > > > > segment
> > > > > > > > > > > > > deleted
> > > > > > > > > > > > > > messages).  Is there
an explanation for this?
>  Does
> > > > kafka
> > > > > > > keep
> > > > > > > > > > > > reference
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > file segments around,
and reuse them as needed or
> > > > > > something?
> > > > > > > >  And
> > > > > > > > > > > then
> > > > > > > > > > > > or
> > > > > > > > > > > > > > restart, the references
to those free segment
> files
> > > are
> > > > > > > > dropped?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thoughts?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > This is with 0.7.2.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Jason
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
>
>
> --
>
>   Mike Heffner <mike@librato.com>
>   Librato, Inc.
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message