lucene-solr-user mailing list archives

From Phillip Farber
Subject Re: best way to get the size of an index
Date Fri, 02 Oct 2009 21:23:35 GMT
Thanks, Mark. I really appreciate your confirmation.


Mark Miller wrote:
> Phillip Farber wrote:
>> Resuming this discussion in a new thread to focus only on this question:
>> What is the best way to get the size of an index so it does not get
>> too big to be optimized (or to allow a very large segment merge) given
>> space limits?
>> I already have the largest 15,000rpm SCSI direct attached storage so
>> buying storage is not an option.  I don't do deletes.
> Even if you did do deletes, it's not really a 3x problem - that's just
> theory - you'd have to work to get there. Deletes are merged out as you
> index additional docs as segments are merged over time. The 3x scenario
> brought up is more of a fun mind exercise than anything that would
> realistically happen.
>> From what I've read, I expect no more than a 2x increase during
>> optimization and have not seen more in practice.
>> I'm thinking: stop indexing, commit, do a du.
>> Will this give me the number I need for what I'm trying to do? Is
>> there a better way?
> Should work fine. When you do the commit, onCommit will be called on the
> IndexDeletionPolicy, and all of the "snapshots" of the index other than
> the latest one will be removed. You should have a clean index to gauge
> the size with. Using something like Java Replication complicates this
> though - in that case, older commit points can be reserved while they
> are being copied.
>> Phil
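
The "stop indexing, commit, do a du" check discussed above could be scripted roughly as follows. This is only a sketch: the function names, the growth_factor parameter, and its default of 2.0 (taken from the "no more than 2x" expectation in the thread) are assumptions for illustration, not part of Solr or Lucene. It sums apparent file sizes, which can differ slightly from the block-based figure du reports, and it should be run only after indexing has stopped and a commit has completed.

```python
import os


def index_size_bytes(index_dir):
    """Sum the apparent sizes of all regular files under index_dir."""
    total = 0
    for root, _dirs, files in os.walk(index_dir):
        for name in files:
            path = os.path.join(root, name)
            # Skip symlinks so linked files are not counted as index data.
            if os.path.isfile(path) and not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def has_headroom(index_bytes, free_bytes, growth_factor=2.0):
    """Return True if free space covers the temporary growth of a merge.

    growth_factor is an assumption: 2.0 means the index may briefly
    occupy twice its current size, so the extra space needed is
    (growth_factor - 1) * index_bytes.
    """
    extra_needed = (growth_factor - 1.0) * index_bytes
    return free_bytes >= extra_needed
```

The free-space figure can come from shutil.disk_usage(index_dir).free, for example has_headroom(index_size_bytes(index_dir), shutil.disk_usage(index_dir).free). If replication is holding older commit points, as Mark notes, the measured size will include them and overstate the size of the latest commit.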
