lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: optimization recommendation
Date Fri, 27 May 2016 08:59:56 GMT
If you have many deletes on the index (not typical for a time-based index)
then forceMerge (or just forceMergeDeletes) will reclaim disk space.

Fewer file handles will be needed to open the index.

Some searches may be faster, but you should test whether that actually
holds for your index and queries.  Much progress has been made over time
in making queries faster across multiple segments.

maxNumSegments is really your knob to turn :)  Smaller numbers = more work
to force merge but better possible gains.
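
One way to check the disk-space and file-count effects on your own index
is to snapshot the index directory before and after calling
IndexWriter.forceMerge(maxNumSegments). The sketch below is a hypothetical
helper, not part of Lucene, and uses only the JDK. It assumes per-segment
files start with an underscore followed by an alphanumeric segment name
(for example _0.cfs or _1a_Lucene50_0.doc); the names IndexDirStats and
segmentOf are made up for illustration. The file count is only a rough
proxy for the file handles needed to open the index.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper (not a Lucene class): summarizes an index directory on disk. */
final class IndexDirStats {
    final int fileCount;        // rough proxy for file handles needed to open the index
    final long totalBytes;      // disk space used by the regular files in the directory
    final Set<String> segments; // distinct segment name prefixes, e.g. "_0", "_1a"

    private IndexDirStats(int fileCount, long totalBytes, Set<String> segments) {
        this.fileCount = fileCount;
        this.totalBytes = totalBytes;
        this.segments = segments;
    }

    /**
     * Assumption: per-segment files look like "_<name>.<ext>" or
     * "_<name>_<suffix>.<ext>". Returns null for other files
     * (for example the commit file or the write lock).
     */
    public static String segmentOf(String fileName) {
        if (!fileName.startsWith("_")) {
            return null;
        }
        int end = 1;
        while (end < fileName.length() && Character.isLetterOrDigit(fileName.charAt(end))) {
            end++;
        }
        return end > 1 ? fileName.substring(0, end) : null;
    }

    /** Scans the top level of indexDir; subdirectories are ignored. */
    public static IndexDirStats of(Path indexDir) throws IOException {
        int files = 0;
        long bytes = 0L;
        Set<String> segs = new TreeSet<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(indexDir)) {
            for (Path p : stream) {
                if (!Files.isRegularFile(p)) {
                    continue;
                }
                files++;
                bytes += Files.size(p);
                String seg = segmentOf(p.getFileName().toString());
                if (seg != null) {
                    segs.add(seg);
                }
            }
        }
        return new IndexDirStats(files, bytes, segs);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: IndexDirStats <indexDir>");
            return;
        }
        IndexDirStats s = of(Paths.get(args[0]));
        System.out.printf("files=%d bytes=%d segments=%d%n",
                s.fileCount, s.totalBytes, s.segments.size());
    }
}
```

Run it against the index directory before and after the merge and compare
the three numbers. Search latency still has to be measured with your own
queries.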

Mike McCandless

http://blog.mikemccandless.com

On Fri, May 27, 2016 at 4:10 AM, Vincent Sevel <v.sevel@lombardodier.com>
wrote:

> Hello,
> I am using indexes that can be as large as 25 GB.
> Indexes are created for a specific time window (for instance, one index
> per week).
> Once the week has passed, they are not written to anymore.
> I have seen the IndexWriter.forceMerge(int) operation, and I had several
> questions:
>
> - After a forceMerge, what gains should we expect? Less space on
> disk, or faster searches?
>
> - What value of maxNumSegments should I target? 1, 5, 10, ...? What are
> the criteria to decide?
> Thanks,
> vince
