lucene-java-user mailing list archives

From: Ivan Brusic
Subject: Re: Slow merging after upgrading to 3.5
Date: Thu, 05 Apr 2012 19:31:38 GMT
Hi Mike,

Response inline:

On Thu, Apr 5, 2012 at 11:36 AM, Michael McCandless wrote:
> I'm assuming this is a "build once and never change" index...?  Else,
> it sounds like you should never run forceMerge...

Correct. The forceMerge was merely to preserve the previous 2.3
behavior of using optimize.

> To preserve insertion order you just need to use one of the
> Log*MergePolicy (which you are already doing).  Merge factor doesn't
> affect this...

I was never sure why the merge factor was set to 2. My experience in
the past was to set a high merge factor when doing a batch index.

> For the fastest way to get to a single-segment index.... use
> NoMergePolicy while indexing the documents, and set the largest RAM
> buffer you can afford.  This will create tons of segments in the index
> dir, which is fine as long as you will not open a reader on it...
> then:
> Open a new IW, with Log*MergePolicy, set a highish (maybe 30)
> mergeFactor, and call forceMerge(1).  You may need to cutover to
> SerialMergeScheduler...

NoMergePolicy? I have never seen that class used before. RAM buffer
size is not an issue. Is the limit still 2048 MB?

Is the fastest way also the best way? :) There will never be a reader
open on the index. Your second solution is similar to the existing
code, with the exception of the mergeFactor. Will setting the merge
factor to a more reasonable number help with the merge speed?
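
As a rough illustration of why the merge factor matters for a batch
build, here is a toy cost model (not Lucene code). It assumes equal-size
flushed segments that are merged in cascading passes, mergeFactor
segments at a time, and counts how many documents get rewritten on the
way to a single segment. The class and method names are made up for this
sketch, and real merge selection is more involved.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: counts document copies while cascading equal-size
// segments down to one. It does not use or emulate the Lucene API.
class MergeCostSketch {

    static long docsCopied(int segmentCount, long docsPerSegment, int mergeFactor) {
        if (mergeFactor < 2) {
            throw new IllegalArgumentException("mergeFactor must be >= 2");
        }
        List<Long> sizes = new ArrayList<>();
        for (int i = 0; i < segmentCount; i++) {
            sizes.add(docsPerSegment);
        }
        long copied = 0;
        while (sizes.size() > 1) {
            List<Long> next = new ArrayList<>();
            for (int i = 0; i < sizes.size(); i += mergeFactor) {
                int end = Math.min(i + mergeFactor, sizes.size());
                long merged = 0;
                for (int j = i; j < end; j++) {
                    merged += sizes.get(j);
                }
                // A lone leftover segment is carried over, not rewritten.
                if (end - i > 1) {
                    copied += merged;
                }
                next.add(merged);
            }
            sizes = next;
        }
        return copied;
    }

    public static void main(String[] args) {
        // 8 one-document segments: three passes at factor 2, one pass at factor 30.
        System.out.println(docsCopied(8, 1, 2));   // prints 24
        System.out.println(docsCopied(8, 1, 30));  // prints 8
    }
}
```

In this model a factor of 2 rewrites every document roughly log2(N)
times, while a higher factor reaches one segment in far fewer passes.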

What enforces the preservation of the insertion order? The
MergePolicy? How does the MergeScheduler affect things? I have used
Lucene on a few projects over the years and never had to tweak the
index creation. I guess I need to reread the tuning chapter in LIA
(Lucene in Action); it's been a few years.
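
On the insertion-order point quoted above: the Log*MergePolicy classes
only merge adjacent segments, which is why document order survives.
A small stand-alone simulation (plain Java, not the Lucene API) may make
that concrete. Segments are modelled as lists of document ids, and the
"bySize" method is a hypothetical stand-in for any policy that is free
to pick non-adjacent segments.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simulation only: segments are lists of doc ids in insertion order.
class AdjacentMergeSketch {

    // Merge neighbouring segments, mergeFactor at a time, until one remains.
    static List<Integer> mergeAdjacent(List<List<Integer>> segments, int mergeFactor) {
        if (mergeFactor < 2) {
            throw new IllegalArgumentException("mergeFactor must be >= 2");
        }
        if (segments.isEmpty()) {
            return new ArrayList<>();
        }
        List<List<Integer>> current = new ArrayList<>(segments);
        while (current.size() > 1) {
            List<List<Integer>> next = new ArrayList<>();
            for (int i = 0; i < current.size(); i += mergeFactor) {
                int end = Math.min(i + mergeFactor, current.size());
                List<Integer> merged = new ArrayList<>();
                for (int j = i; j < end; j++) {
                    merged.addAll(current.get(j));
                }
                next.add(merged);
            }
            current = next;
        }
        return current.get(0);
    }

    // Hypothetical contrast: concatenate segments smallest-first,
    // ignoring their position in the index.
    static List<Integer> mergeBySize(List<List<Integer>> segments) {
        List<List<Integer>> sorted = new ArrayList<>(segments);
        sorted.sort((a, b) -> Integer.compare(a.size(), b.size()));
        List<Integer> merged = new ArrayList<>();
        for (List<Integer> segment : sorted) {
            merged.addAll(segment);
        }
        return merged;
    }

    public static void main(String[] args) {
        List<List<Integer>> segments = Arrays.asList(
                Arrays.asList(0, 1, 2), Arrays.asList(3), Arrays.asList(4, 5));
        System.out.println(mergeAdjacent(segments, 2)); // prints [0, 1, 2, 3, 4, 5]
        System.out.println(mergeBySize(segments));      // prints [3, 4, 5, 0, 1, 2]
    }
}
```

In this model the merge factor changes how many passes are needed but
not the final order, which matches the remark that merge factor does not
affect insertion order.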


