Yeah, that mergeFactor is way too high and will cause
too-many-open-files (if the index has enough segments).
Also, you should setRamBufferSizeMB instead of maxBufferedDocs, for
faster index throughput.
Calling optimize from two threads doesn't help it run faster when
using ConcurrentMergeScheduler (the default). Ie with CMS, optimize
simply waits for CMS to perform all the necessary merges.
Mike
http://blog.mikemccandless.com
On Mon, Apr 4, 2011 at 4:06 PM, Simon Willnauer
<simon.willnauer@googlemail.com> wrote:
> On Mon, Apr 4, 2011 at 9:59 PM, Paul Taylor <paul_t100@fastmail.fm> wrote:
>> On 04/04/2011 20:13, Michael McCandless wrote:
>>>
>>> How are you merging these indices? (IW.addIndexes?).
>>>
>>> Are you changing any of IW's defaults, eg mergeFactor?
>>>
>>> Mike
>>
>> Hi Mike
>>
>> I have
>>
>> indexWriter.setMaxBufferedDocs(10000);
>> indexWriter.setMergeFactor(3000);
>>
>
> I didn't read though the entire email but MergeFactor 3000 doesn't
> look right at all. You should try something between 5 and 30. I
> haven't seen a mergeFactor > 50 doing any good in a common env. Why do
> you use such a large factor?
>
> simon
>> these are a hangover from earlier code, I tried changing them and it didnt
>> seem to make any difference, but do they look wrong ?
>>
>> The index that falls over during optmizationcreates about 10,000,000 records
>>
>> What I do is build 10 different indexes sequentially one after the after (so
>> only one is hitting the db at any one time) but once the index is built I
>> optimize the index in the background
>> by creating an instance of the following class and submitting it to an
>> ExecutionService , configured to have at most 2 threads acting
>>
>> I built my own class rather than just using optimize() with the background
>> option because that wouldnt allow me to do the necessary
>> debugging/calculations
>>
>>
>> static class IndexWriterOptimizerAndClose implements Callable<Boolean>
>> {
>> private int maxId;
>> private IndexWriter indexWriter;
>> private DatabaseIndex index;
>> private IndexOptions options;
>>
>> /**
>> *
>> * @param maxId
>> * @param indexWriter
>> * @param index
>> * @param options
>> */
>> public IndexWriterOptimizerAndClose(int maxId, IndexWriter
>> indexWriter, DatabaseIndex index, IndexOptions options)
>> {
>> this.maxId=maxId;
>> this.indexWriter= indexWriter;
>> this.index=index;
>> this.options=options;
>> }
>>
>> public Boolean call() throws IOException, SQLException
>> {
>>
>> StopWatch clock = new StopWatch();
>> clock.start();
>> String path = options.getIndexesDir() + index.getFilename();
>> System.out.println(index.getName()+":Started Optimization at
>> "+Utils.formatCurrentTimeForOutput());
>>
>> indexWriter.optimize();
>> indexWriter.close();
>> clock.stop();
>> // For debugging to check sql is not creating too few/many rows
>> if(true) {
>> int dbRows = index.getNoOfRows(maxId);
>> IndexReader reader = IndexReader.open(FSDirectory.open(new
>> File(path)),true);
>> System.out.println(index.getName()+":"+dbRows+" db
>> rows:"+(reader.maxDoc() - 1)+" lucene docs");
>> reader.close();
>> }
>> System.out.println(index.getName()+":Finished Optimization:" +
>> Utils.formatClock(clock));
>>
>> return true;
>> }
>> }
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
|