lucene-solr-user mailing list archives

From Phillip Farber <>
Subject index size before and after commit
Date Thu, 01 Oct 2009 12:54:51 GMT
I am trying to automate a build process that adds documents to 10 shards 
across 5 machines, and I need to limit the size of a shard to no more than 
200GB because I only have 400GB of disk available to optimize a given shard.

Why does the size (du) of an index typically decrease after a commit?  
I've seen it drop by as much as 296GB to 151GB and by as little as 
183GB to 182GB.  Is the size after a commit close to the size the index 
would have after an optimize?  For that matter, are there cases where an 
optimize needs more than 2x the index size in disk space?  I've heard of 
such cases but have not observed them on my system.  I only do adds to the 
shards, never query them.  An LVM snapshot of the shard receives the queries.

Is doing a commit before I take a du a reliable way to gauge the size of 
the shard?  It is really bad news to allow a shard to go over 200GB in 
my use case.  How do others manage this problem of 2x space needed to 
optimize with "limited" disk space?
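
For illustration, the size check described above could be sketched roughly 
as follows. This is only a sketch under stated assumptions: the function 
names are hypothetical, the 2x headroom factor is the rule of thumb from 
this message rather than a guaranteed bound, and the directory walk reports 
apparent file sizes, which can differ somewhat from what du shows. It does 
not issue the commit itself.

```python
import os

GB = 1024 ** 3


def dir_size_bytes(path):
    """Sum the apparent sizes of all regular files under `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                if not os.path.islink(full):
                    total += os.path.getsize(full)
            except OSError:
                # Index files can disappear while we walk (e.g. after a
                # merge), so skip anything that vanished.
                pass
    return total


def max_shard_bytes(disk_bytes, optimize_factor=2.0):
    """Largest index size that still leaves room to optimize.

    `optimize_factor` is an assumption: 2x is the rule of thumb quoted
    in this message, not a hard upper bound.
    """
    return int(disk_bytes / optimize_factor)


def shard_has_room(index_bytes, disk_bytes, optimize_factor=2.0):
    """True if the index is at or below the optimize-safe limit."""
    return index_bytes <= max_shard_bytes(disk_bytes, optimize_factor)
```

With 400GB of disk and a factor of 2, max_shard_bytes(400 * GB) gives the 
200GB limit, so a 151GB index passes the check and a 296GB one does not.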

Advice greatly appreciated.

