lucene-dev mailing list archives

From "David Balmain" <>
Subject Re: Ferret's changes
Date Wed, 11 Oct 2006 01:48:37 GMT
On 10/11/06, Yonik Seeley <> wrote:
> On 10/10/06, David Balmain <> wrote:
> > The start of my benchmarks is here:
> >
> > I did set maxBufferedDocs to 1000 and optimized both indices at the
> > end
> Ah, I had missed that link last time... Is the current code up-to-date?
> The Lucene version is using maxBufferedDocs=1000, while it looks
> like the Ferret version is using 20,000.  Given that the corpus is
> only 19,043 documents, the Ferret optimize would be a no-op since it's
> already a single segment?
> Also, the Ferret merge factor is set to 100, while java-lucene is
> unset (defaults to 10).  That will matter if maxBufferedDocs is
> lowered in Ferret.

Sure, I'll give these changes a try.
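For what it's worth, the interplay of max_buffered_docs and merge_factor can be sketched with a toy simulation (plain Ruby, no Ferret or Lucene needed; the equal-size log-merge model below is a simplification of what either library actually does):

```ruby
# Toy model: docs accumulate in an in-memory buffer; every
# max_buffered_docs additions the buffer is flushed to a new "segment",
# and whenever merge_factor segments of the same size exist they are
# merged into one larger segment (roughly Lucene's log-merge behaviour).
def simulate_index(num_docs, max_buffered_docs, merge_factor)
  segments = []   # each entry is a segment's doc count
  flushes  = 0
  merges   = 0
  buffered = 0
  num_docs.times do
    buffered += 1
    if buffered == max_buffered_docs
      segments << buffered
      buffered = 0
      flushes += 1
      # merge any run of merge_factor equal-sized segments
      while segments.count(segments.last) >= merge_factor
        size = segments.last
        merge_factor.times { segments.delete_at(segments.rindex(size)) }
        segments << size * merge_factor
        merges += 1
      end
    end
  end
  segments << buffered if buffered > 0   # leftover docs still in memory
  { segments: segments, flushes: flushes, merges: merges }
end
```

With the 19,043-doc corpus and max_buffered_docs=20,000, the buffer is never flushed, so everything sits in one segment and optimize has nothing to do; with maxBufferedDocs=1000 and the default merge factor of 10, there are 19 flushes and a merge along the way.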

> If maximum indexing speed is really the goal, I'd also expect the
> non-compound file format to be used.  A bigger corpus (more docs, not
> larger size) would also be welcome.
> What is the effect of :max_buffer_memory => 0x10000000 ?

Ferret keeps indexing in memory until it reaches this memory limit; it
then dumps to disk. That is why I can safely set max_buffered_docs to
whatever I want in Ferret. If I did the same in Lucene I would risk
running out of memory. Anyway, I'll try with a much larger corpus and
see what results I get.

