lucene-solr-user mailing list archives

From Yonik Seeley <yo...@lucidimagination.com>
Subject Re: Questions on FieldValueCache
Date Mon, 03 Aug 2009 18:43:51 GMT
On Mon, Aug 3, 2009 at 2:18 PM, Stephen Duncan Jr <stephen.duncan@gmail.com> wrote:
> On Fri, Jul 31, 2009 at 5:23 PM, Yonik Seeley <yonik@lucidimagination.com> wrote:
>
>> On Fri, Jul 31, 2009 at 5:06 PM, Stephen Duncan Jr <stephen.duncan@gmail.com> wrote:
>> > I have a couple more questions on the FieldValueCache.  I see that the
>> > number of items in the cache is basically the number of multi-valued
>> > fields facets have been requested for.  What does each entry in the cache
>> > actually contain?  How does its size grow as the number of total
>> > documents increases?
>>
>> It's basically an int[maxDoc] array whose entries contain the list of
>> delta-coded vint values or, if the list of values doesn't fit in an int,
>> point to shared byte arrays.  See the javadoc for UnInvertedField for
>> more details.
>>
>> -Yonik
>> http://www.lucidimagination.com
>>
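To make the "delta-coded vint" idea above concrete, here is a minimal sketch in Python. It is only an illustration of the general technique, not Solr's actual code: the function names are made up, and the byte order (low-order 7 bits first, high bit meaning "more bytes follow") is an assumption that may differ from what UnInvertedField really does.

```python
def encode_vint(value):
    """Encode a non-negative int using 7 data bits per byte.

    Assumed layout: low-order bits first, high bit set on every byte
    except the last one.
    """
    if value < 0:
        raise ValueError("vint values must be non-negative")
    out = bytearray()
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)  # more bytes follow
        value >>= 7
    out.append(value)  # final byte, high bit clear
    return bytes(out)


def encode_term_numbers(term_numbers):
    """Delta-code a sorted list of term numbers, then vint-encode each gap."""
    out = bytearray()
    previous = 0
    for number in term_numbers:
        out += encode_vint(number - previous)  # store the gap, not the value
        previous = number
    return bytes(out)


def fits_in_int(encoded):
    """True if the encoded list could be stored inline in one 4-byte int."""
    return len(encoded) <= 4


small = encode_term_numbers([3, 5, 130])             # gaps 3, 2, 125
large = encode_term_numbers([1000, 200000, 5000000])
print(len(small), fits_in_int(small))
print(len(large), fits_in_int(large))
```

A document with a few closely spaced term numbers encodes into a handful of bytes and could sit directly in its int slot; a document with many or widely spaced values would need the shared byte arrays instead.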
>
> I'm getting the following warning in my logs: 2009-08-03 13:41:40,114
> [http-127.0.0.1-8080-1] WARN  org.apache.solr.core.SolrCore - Approaching
> too many values for UnInvertedField faceting on field 'originaltext' :
> bucket size=15802492
>
> What's the impact of that?  If the number of values (number of unique terms
> for that field, or some other "values"?) exceeds that limit, will faceting
> for that field go back to a different technique and be slower, or...?

It will throw an exception.

This method of faceting wasn't really designed for big full-text fields.
The enum method should work better for this... try something like the following:

f.originaltext.facet.method=enum
facet.enum.cache.minDf=10000
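
For reference, a rough sketch of how those two parameters might be added to a facet request, here by building the URL in Python. The host, port, handler path, q=*:* and rows=0 are assumptions for illustration only; adjust them to your setup. facet.enum.cache.minDf sets the minimum document frequency a term needs before the filterCache is used for it, so 10000 is just a starting value to tune.

```python
from urllib.parse import urlencode


def build_facet_query(base_url, field, min_df):
    """Build a facet request URL that forces the enum method for one field."""
    params = [
        ("q", "*:*"),        # assumed: match all documents
        ("rows", "0"),       # assumed: only facet counts are wanted
        ("facet", "true"),
        ("facet.field", field),
        (f"f.{field}.facet.method", "enum"),      # per-field override
        ("facet.enum.cache.minDf", str(min_df)),  # skip filterCache for rare terms
    ]
    return base_url + "?" + urlencode(params)


# Hypothetical local instance; the URL is not from the original thread.
url = build_facet_query("http://localhost:8983/solr/select", "originaltext", 10000)
print(url)
```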

-Yonik
http://www.lucidimagination.com
