lucene-dev mailing list archives

From Michael Garski <>
Subject Re: FST performance difference b/w build() & load()
Date Wed, 03 Jul 2013 20:09:03 GMT
This is using Solr/Lucene 4.3.0.

The test was done with a fresh JVM each time. I have a search component and a query parser
that make heavy use of WFSTCompletionLookup instances, and I observed a difference in query
performance between creating a new lookup and loading a previously built one at startup. When
Solr is started with the lookup loaded from a previously saved one, and the component is then
triggered to build a new lookup (done periodically to update the weights), performance
returns to the same level as if Solr had initially been started by building a new lookup. This
is what leads me to suspect a performance difference between a lookup created with load()
and one created with build().

Admittedly this is not a good measure of the lookup performance between the two cases, and
I will create isolated tests of performance. I just wanted to find out if this is something
I should expect... sounds like it is not expected and I will dig in further.
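
A minimal sketch of what such an isolated test could look like, using only the JDK. Everything in it is an assumption made for illustration: the class name LookupBench, the method names, and the choice to wrap the lookup as a java.util.function.Function (which needs Java 8 or later) are hypothetical and not part of Lucene or Solr. The idea is to time the same set of keys against a lookup obtained from build() and against one obtained from load(), each in its own fresh JVM, with warm-up passes so the JIT has settled before measuring.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class LookupBench {

    /** Runs every key through the lookup once and returns the elapsed time in nanoseconds. */
    public static long timeOnce(Function<String, List<String>> lookup, List<String> keys) {
        long resultCount = 0;
        long start = System.nanoTime();
        for (String key : keys) {
            // Consume the result so the call cannot be optimized away.
            resultCount += lookup.apply(key).size();
        }
        long elapsed = System.nanoTime() - start;
        if (resultCount < 0) {
            throw new IllegalStateException("unreachable: sizes are never negative");
        }
        return elapsed;
    }

    /** Discards warm-up passes, then returns the median time of the measured passes. */
    public static long medianNanos(Function<String, List<String>> lookup, List<String> keys,
                                   int warmupPasses, int measuredPasses) {
        if (measuredPasses <= 0) {
            throw new IllegalArgumentException("measuredPasses must be positive");
        }
        for (int i = 0; i < warmupPasses; i++) {
            timeOnce(lookup, keys);
        }
        long[] samples = new long[measuredPasses];
        for (int i = 0; i < measuredPasses; i++) {
            samples[i] = timeOnce(lookup, keys);
        }
        Arrays.sort(samples);
        return samples[measuredPasses / 2];
    }

    public static void main(String[] args) {
        // Stand-in lookup (hypothetical): a linear prefix scan over a tiny term list.
        // In a real test this lambda would delegate to the built or the loaded lookup.
        List<String> terms = Arrays.asList("lucene", "lucid", "solr", "solid");
        Function<String, List<String>> standIn = prefix -> {
            List<String> hits = new ArrayList<>();
            for (String term : terms) {
                if (term.startsWith(prefix)) {
                    hits.add(term);
                }
            }
            return hits;
        };
        List<String> keys = Arrays.asList("lu", "luc", "so", "sol");
        long median = medianNanos(standIn, keys, 50, 200);
        System.out.println("median ns per pass: " + median);
    }
}
```

Reporting the median rather than the mean keeps a single GC pause from skewing the comparison. A dedicated harness such as JMH would be more rigorous, but a sketch like this should be enough to tell whether the gap is real.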



On Jul 3, 2013, at 12:37 PM, Michael McCandless <> wrote:

> Which Lucene version?
> A "just built" FST stores its bytes in smallish (64 KB I think) pages,
> while a loaded-from-disk FST uses much larger pages (1 GB I think).
> But I would expect the loaded FST to be faster, not the other way around.
> How much of a speed difference are you seeing?  And are you restarting
> the JVM in between the two runs?  (Ie fresh JVM for the "just built"
> case, and a fresh JVM for the "loaded from disk" case).
> Mike McCandless
> On Wed, Jul 3, 2013 at 3:31 PM, Michael Garski <> wrote:
>> Hello -
>> I've observed a noticeable performance difference in looking up completions in a
>> WFSTCompletionLookup when it is created using the build() method with a TermFreqIterator versus
>> one that is created by loading a previously saved instance, with the instance created from
>> the build method being faster. The saved lookup is ~275MB on disk. I have not dug in to determine
>> why this is, but is this to be expected?
>> Thanks,
>> Michael
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail:
>> For additional commands, e-mail:
