lucene-solr-user mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: How to read values of a field efficiently
Date Mon, 30 Jul 2007 07:30:21 GMT

: Is it possible to get the values from the ValueSource (or from
: getFieldCacheCounts) sorted by its natural order (from lowest to
: highest values)?

well, an inverted term index is already a data structure listing terms
from lowest to highest and the associated documents -- so if you want to
iterate from low to high between a range and find matching docs you should
just use the TermEnum -- the whole point of the FieldCache (and
FieldCacheSource) is to have a "reverse inverted index" so you can quickly
fetch the indexed value if you know the docId.
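
to make the difference between the two structures concrete, here is a minimal sketch in plain python (not Lucene code). the toy prices, `docs_in_range`, and `field_cache` are all made up for illustration: the sorted term list stands in for what a TermEnum walks, and the array stands in for what a FieldCache gives you.

```python
from bisect import bisect_left, bisect_right

# toy data (hypothetical): docId -> indexed price
doc_values = {0: 5, 1: 20, 2: 5, 3: 12, 4: 20, 5: 7}

# inverted index: term -> list of docIds containing it
inverted = {}
for doc_id, value in doc_values.items():
    inverted.setdefault(value, []).append(doc_id)

# terms kept in natural order, lowest to highest
sorted_terms = sorted(inverted)

def docs_in_range(low, high):
    """Walk the terms between low and high (inclusive), in order,
    returning each term with the docs that contain it."""
    start = bisect_left(sorted_terms, low)
    end = bisect_right(sorted_terms, high)
    return [(term, inverted[term]) for term in sorted_terms[start:end]]

# "reverse inverted index": array indexed by docId -> value
field_cache = [doc_values[d] for d in range(len(doc_values))]

print(docs_in_range(5, 12))  # terms in order, with their docs
print(field_cache[3])        # value for a known docId
```

the first lookup answers "which docs have values in this range, in order"; the second answers "what is the value for this docId". neither structure is good at the other question.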

perhaps you should elaborate a little more on what it is you are trying to
do so we can help you figure out how to do it more efficiently ... i know
you mentioned computing price ranges in your first message ... but you
also didn't post any clear code about that part of your problem, just that
the *other* part of your code that iterated over every doc was too slow
... perhaps you shouldn't be iterating over every doc to figure out your
ranges .. perhaps you can iterate over the terms themselves?
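
as a rough sketch of what "iterate over the terms" could look like for price ranges (plain python, not Lucene code): `range_counts`, the `term_doc_freq` dict, and the bucket edges are hypothetical names and values. it assumes you can get each distinct price along with the number of docs containing it, and note that such counts cover the whole index, not just the docs matching some query.

```python
from bisect import bisect_right

def range_counts(term_doc_freq, edges):
    """Count docs per price bucket by visiting each distinct price once.

    term_doc_freq: dict of price -> number of docs with that price
    edges: sorted bucket edges; bucket i covers edges[i] <= price < edges[i+1]
    """
    counts = [0] * (len(edges) - 1)
    # visit terms in natural order, the way a term enumeration would
    for price in sorted(term_doc_freq):
        i = bisect_right(edges, price) - 1
        if 0 <= i < len(counts):  # skip prices outside all buckets
            counts[i] += term_doc_freq[price]
    return counts

# hypothetical index: 4 distinct prices spread over 6 docs
term_doc_freq = {5: 2, 7: 1, 12: 1, 20: 2}
print(range_counts(term_doc_freq, [0, 10, 20, 50]))
```

the work here is proportional to the number of distinct prices, not the number of docs.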


hang on ... rereading your first message i just noticed something i
definitely didn't spot before...

>> Fairly long: getFieldCacheCounts for the cat field takes ~70 ms
>> for the second request, while reading prices takes ~600 ms.

...i clearly missed this, and fixated on your assertion that your reading
of field values took longer than the stock methods -- but you're not just
comparing the time needed by different methods, you're also timing
different fields.

this actually makes a lot of sense since there are probably a lot fewer
unique values for the cat field, so there are a lot fewer discrete values
to deal with when computing counts.
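
a simplified model of why the number of unique values matters (plain python; this is not the actual getFieldCacheCounts code, and `field_cache_counts`, `ordinals`, and `lookup` are hypothetical names): if counting uses one counter per distinct value, then both the counter array and the final pass over it grow with the number of unique values in the field.

```python
def field_cache_counts(ordinals, lookup, doc_ids):
    """Count matching docs per distinct field value.

    ordinals: list indexed by docId -> position of that doc's value in lookup
    lookup:   sorted list of the field's distinct values
    doc_ids:  the docs to count over
    """
    counts = [0] * len(lookup)      # one counter per unique value
    for doc_id in doc_ids:          # one step per matching doc
        counts[ordinals[doc_id]] += 1
    # final pass touches every unique value, matched or not
    return {lookup[i]: c for i, c in enumerate(counts) if c > 0}

# a "cat"-like field: many docs, only two distinct values
cat_lookup = ["book", "music"]
cat_ordinals = [0, 1, 0, 0]
print(field_cache_counts(cat_ordinals, cat_lookup, [0, 1, 2, 3]))
```

for a field like price, `lookup` could have nearly as many entries as there are docs, so the same loop has far more counters to allocate and scan.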




-Hoss

