lucene-solr-user mailing list archives

From Mark Miller
Subject Re: best way to get the size of an index
Date Fri, 02 Oct 2009 12:38:15 GMT
Phillip Farber wrote:
> Resuming this discussion in a new thread to focus only on this question:
> What is the best way to get the size of an index so it does not get
> too big to be optimized (or to allow a very large segment merge) given
> space limits?
> I already have the largest 15,000rpm SCSI direct attached storage so
> buying storage is not an option.  I don't do deletes.
Even if you did do deletes, it's not really a 3x problem. That's just
theory, and you'd have to work to get there. Deletes are merged out as
segments are merged over time while you index additional docs. The 3x
scenario that was brought up is more of a fun mind exercise than
anything that would realistically happen.
> From what I've read, I expect no more than a 2x increase during
> optimization and have not seen more in practice.
> I'm thinking: stop indexing, commit, do a du.
> Will this give me the number I need for what I'm trying to do? Is
> there a better way?
Should work fine. When you do the commit, onCommit will be called on the
IndexDeletionPolicy, and all of the "snapshots" (commit points) of the
index other than the latest one will be removed. You should then have a
clean index to gauge the size with. Using something like Java
Replication complicates this, though: in that case, older commit points
can be reserved while they are being copied, so du may count files that
belong to those older commit points as well.
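
As a rough illustration of the "stop indexing, commit, do a du" check,
here is a minimal sketch in Python. It is not part of Solr or Lucene.
The function names and the growth factor are assumptions made for
illustration; the factor of 2 reads the "2x" estimate above as peak disk
use of about twice the committed index size, so the extra space needed
is roughly one more copy of the index.

```python
import os


def index_size_bytes(index_dir):
    """Sum the sizes of all regular files under index_dir (similar to du)."""
    total = 0
    for root, _dirs, files in os.walk(index_dir):
        for name in files:
            path = os.path.join(root, name)
            # Skip symlinks so linked files are not counted.
            if os.path.isfile(path) and not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def has_room_to_optimize(index_bytes, free_bytes, growth_factor=2.0):
    """Return True if the assumed extra space for an optimize fits in free_bytes.

    growth_factor is an assumption: 2.0 means peak usage of about twice
    the current index size, so one more index-sized chunk must be free.
    """
    extra_needed = (growth_factor - 1.0) * index_bytes
    return extra_needed <= free_bytes
```

To use it, run it after indexing has stopped and the commit has
finished, and pass the free space on the index volume, for example
shutil.disk_usage(index_dir).free, as free_bytes. The index directory
path depends on your installation.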
> Phil

- Mark
