lucene-dev mailing list archives

From "Yonik Seeley" <>
Subject Re: flushRamSegments() is "over merging"?
Date Wed, 16 Aug 2006 22:53:42 GMT
On 8/16/06, Doron Cohen <> wrote:
> Under-merging would hurt search, unless optimize is called explicitly, but
> the index should "behave" without requiring the user to call optimize. 388
> deals with this.

Depends on what you mean by "behave" :-)
More segments than expected can cause failure because of file
descriptor exhaustion.  It's nice to have a calculable cap on the
number of segments. It also depends on exactly what one thinks the
index invariants should be w.r.t. mergeFactor.
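That "calculable cap" can be sketched directly. The following is a toy model, not Lucene's actual code: it assumes a logarithmic merge policy where each level holds segments of roughly maxBufferedDocs * mergeFactor^n documents and at most mergeFactor - 1 segments are allowed per level before a merge, so the total segment count is bounded by (mergeFactor - 1) times the number of levels.

```java
// Illustrative sketch only: upper bound on segment count under a
// logarithmic merge policy. The names (mergeFactor, maxBufferedDocs)
// mirror Lucene's settings, but the formula is an assumption for
// illustration, not Lucene's exact implementation.
public class SegmentCap {
    static long maxSegments(long numDocs, int mergeFactor, int maxBufferedDocs) {
        // Count how many size levels the index can contain: level n holds
        // segments of roughly maxBufferedDocs * mergeFactor^n documents.
        long levels = 1;
        long levelSize = maxBufferedDocs;
        while (levelSize * mergeFactor <= numDocs) {
            levelSize *= mergeFactor;
            levels++;
        }
        // At most (mergeFactor - 1) segments per level before a merge fires.
        return (mergeFactor - 1) * levels;
    }

    public static void main(String[] args) {
        // e.g. one million docs, mergeFactor=10, maxBufferedDocs=10:
        // 6 levels * 9 segments per level
        System.out.println(maxSegments(1_000_000, 10, 10)); // prints 54
    }
}
```

The point is that the bound grows only logarithmically with document count, which is what makes file-descriptor usage predictable when the mergeFactor invariant actually holds.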

> Over-merging - in current flushRamSegments() code - would merge at most
> merge-factor documents prematurely.


>  Since merge-factor is usually not very
> large, this might be a minor issue - but still, if an index is growing by
> small doses, does it make sense to re-merge with the last disk segment each
> time the index is closed? Why not let it be simply controlled by
> maybeMergeSegments?

I personally see mergeFactor as the maximum number of segments at any
level in the index, with level determined by segment size in documents
(roughly maxBufferedDocs * mergeFactor^n documents at level n).
maybeMergeSegments doesn't enforce this in the presence of partially
filled segments because it counts documents and not segments.  Since
partially filled segments aren't written in a single IndexWriter
session, this only needs to be checked for on a close().
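The gap between the two triggers can be made concrete with a toy model (everything below is an illustrative assumption, not Lucene's actual maybeMergeSegments()): a document-counting trigger waits until a level has accumulated enough documents, while a segment-counting trigger fires once mergeFactor segments exist at the level. Partially filled segments satisfy the second condition long before the first.

```java
// Toy model of the distinction above: a document-counting merge trigger
// can leave far more than mergeFactor segments at a level when segments
// are partially filled (e.g. from many small open/close sessions).
// None of this is Lucene's real code; names are hypothetical.
import java.util.List;

public class MergeCheck {
    static final int MERGE_FACTOR = 10;
    static final int MAX_BUFFERED_DOCS = 10;

    // Document-counting trigger: merge once the level holds enough docs.
    static boolean shouldMergeByDocs(List<Integer> segmentDocCounts) {
        int docs = segmentDocCounts.stream().mapToInt(Integer::intValue).sum();
        return docs >= MAX_BUFFERED_DOCS * MERGE_FACTOR;
    }

    // Segment-counting trigger: merge once mergeFactor segments exist.
    static boolean shouldMergeBySegments(List<Integer> segmentDocCounts) {
        return segmentDocCounts.size() >= MERGE_FACTOR;
    }

    public static void main(String[] args) {
        // Twelve partially filled segments of 3 docs each: 36 docs total.
        List<Integer> partial = java.util.Collections.nCopies(12, 3);
        System.out.println(shouldMergeByDocs(partial));     // false: 36 < 100
        System.out.println(shouldMergeBySegments(partial)); // true: 12 >= 10
    }
}
```

A document-count check is cheap during a session, where flushed segments are always full; the segment-count invariant only needs re-checking at close(), which is exactly the situation discussed here.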

-Yonik
Solr, the open-source Lucene search server
