lucene-dev mailing list archives

From Michael Garski <mgar...@mac.com>
Subject Re: FST performance difference b/w build() & load()
Date Wed, 03 Jul 2013 20:09:03 GMT
This is using Solr/Lucene 4.3.0.

The test was done with a fresh JVM each time. I have a search component & query parser
that heavily use WFSTCompletionLookups, and I observed the difference in query performance
between creating a new lookup vs. loading a previously built one at startup. When Solr is
started with the lookup loaded from a previously saved one, and I subsequently trigger the
component to build a new lookup (done periodically to update the weights), the performance
returns to the same level as if Solr had initially started by building a new lookup. This
is what leads me to suspect a difference in performance between a lookup created with load
and one created with build.

Admittedly this is not a good measure of the lookup performance between the two cases, and
I will create isolated tests of performance. I just wanted to find out if this is something
I should expect... sounds like it is not expected and I will dig in further.
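
An isolated test along these lines could start from a small timing harness like the
sketch below. It is only a sketch under stated assumptions: the class name LookupBench,
the method medianRoundNanos, and the stand-in prefix lookup are hypothetical and are not
part of Lucene or Solr. In a real comparison the stand-in function would be replaced by
one wrapper around the lookup produced by build() and another around the lookup produced
by load(), each measured in its own fresh JVM. The sketch uses Java 8 lambdas for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class LookupBench {

    /**
     * Runs every query once per round and returns the median round time in
     * nanoseconds (the upper middle sample when timedRounds is even).
     * The lookup is passed in as a plain function so the same harness can
     * time any implementation.
     */
    public static long medianRoundNanos(Function<String, List<String>> lookup,
                                        List<String> queries,
                                        int warmupRounds,
                                        int timedRounds) {
        if (timedRounds <= 0) {
            throw new IllegalArgumentException("timedRounds must be positive");
        }
        long sink = 0; // accumulates result sizes so the calls are not optimized away

        // Warm-up rounds give the JIT a chance to compile the hot path before timing.
        for (int r = 0; r < warmupRounds; r++) {
            for (String q : queries) {
                sink += lookup.apply(q).size();
            }
        }

        long[] samples = new long[timedRounds];
        for (int r = 0; r < timedRounds; r++) {
            long start = System.nanoTime();
            for (String q : queries) {
                sink += lookup.apply(q).size();
            }
            samples[r] = System.nanoTime() - start;
        }
        if (sink < 0) {
            System.out.println(sink); // never expected; keeps sink observable
        }

        Arrays.sort(samples);
        return samples[timedRounds / 2];
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for a completion lookup: a linear prefix scan.
        final List<String> terms = Arrays.asList("lucene", "lucid", "solr", "search");
        Function<String, List<String>> standIn = prefix -> {
            List<String> hits = new ArrayList<>();
            for (String t : terms) {
                if (t.startsWith(prefix)) {
                    hits.add(t);
                }
            }
            return hits;
        };

        List<String> queries = Arrays.asList("lu", "s", "x");
        long median = medianRoundNanos(standIn, queries, 1000, 101);
        System.out.println("median ns per round: " + median);
    }
}
```

Reporting the median rather than the mean keeps a single GC pause or page fault from
dominating the result, which matters when the two cases may differ only slightly.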

Thanks,

Michael


On Jul 3, 2013, at 12:37 PM, Michael McCandless <lucene@mikemccandless.com> wrote:

> Which Lucene version?
> 
> A "just built" FST stores its bytes in smallish (64 KB I think) pages,
> while a loaded-from-disk FST uses much larger pages (1 GB I think).
> 
> But I would expect the loaded FST to be faster, not the other way around.
> 
> How much of a speed difference are you seeing?  And are you restarting
> the JVM in between the two runs?  (Ie fresh JVM for the "just built"
> case, and a fresh JVM for the "loaded from disk" case).
> 
> Mike McCandless
> 
> http://blog.mikemccandless.com
> 
> 
> On Wed, Jul 3, 2013 at 3:31 PM, Michael Garski <mgarski@mac.com> wrote:
>> Hello -
>> 
>> I've observed a noticeable performance difference in looking up completions in a
>> WFSTCompletionLookup when it is created using the build() method with a TermFreqIterator
>> versus one that is created by loading a previously saved instance, with the instance
>> created from the build method being faster. The saved lookup is ~275MB on disk. I have
>> not dug in to determine why this is, but is this to be expected?
>> 
>> Thanks,
>> 
>> Michael
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: dev-help@lucene.apache.org
>> 

