lucene-solr-user mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: Filter cache pollution during sharded edismax queries
Date Wed, 01 Oct 2014 17:17:41 GMT

: +1 for using a different cache, but that's being quite unfamiliar with the
: code.

In a common case, people tend to "drill down" and filter on facet 
constraints -- so using a special-purpose cache for the refinements would 
result in redundant caching of the same info in multiple places.

: > > What's the point to refine these counts? I've thought that it make sense
: > > only for facet.limit ed requests. Is it correct statement? can those who

Refinement only happens if facet.limit is used and there are eligible 
"top" constraints that were not returned by some shards.
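To make the trigger concrete, here's a rough sketch (not Solr's actual code, just an illustration of the idea): with facet.limit=N, each shard returns only its own top-N constraints, so a globally popular term can be missing from some shard's response and has to be refined with an extra request.

```python
def terms_needing_refinement(shard_responses, limit):
    """shard_responses: one dict per shard mapping term -> count
    (each shard's top `limit` constraints only)."""
    # Merge the partial counts to pick the candidate "top" terms.
    merged = {}
    for resp in shard_responses:
        for term, count in resp.items():
            merged[term] = merged.get(term, 0) + count
    top = sorted(merged, key=merged.get, reverse=True)[:limit]
    # Any candidate absent from some shard's response needs a refinement hop
    # to get an exact count from that shard.
    return {t for t in top if any(t not in resp for resp in shard_responses)}

# Each shard returned only its top 2 terms:
shard1 = {"red": 10, "blue": 9}
shard2 = {"green": 8, "red": 7}
print(terms_needing_refinement([shard1, shard2], limit=2))
```

"blue" made the merged top 2 but shard2 never reported it, so it needs refinement; "red" came back from every shard, so it doesn't.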

: > > suffer from the low performance, just unlimit  facet.limit to avoid that
: > > distributed hop?

As noted, setting facet.limit=-1 might help for low-cardinality fields: it 
ensures that every shard returns a count for every value, so no refinement 
is needed.  But that doesn't really help you for fields with 
unknown/unbounded cardinality.
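For reference, the request would look something like this (a hypothetical parameter sketch; "category" is an assumed low-cardinality field name, not anything from the thread):

```python
# Facet parameters for an unlimited facet request (sketch only):
params = {
    "q": "*:*",
    "facet": "true",
    "facet.field": "category",  # assumed low-cardinality field
    "facet.limit": -1,          # every shard returns counts for ALL values,
                                # so there is nothing left to refine
}
print(params["facet.limit"])
```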

As part of the distributed pivot faceting work, the amount of 
"overrequest" done in phase 1 (for both facet.pivot & facet.field) was 
made configurable via 2 new parameters...

https://lucene.apache.org/solr/4_10_0/solr-solrj/org/apache/solr/common/params/FacetParams.html#FACET_OVERREQUEST_RATIO
https://lucene.apache.org/solr/4_10_0/solr-solrj/org/apache/solr/common/params/FacetParams.html#FACET_OVERREQUEST_COUNT

...so depending on the distribution of your data, you might find that by 
adjusting those values to increase the amount of overrequesting done, you 
can decrease the amount of refinement needed -- but there are obviously 
tradeoffs.
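My reading of those FacetParams docs is that the phase-1 per-shard limit is sized roughly as `limit * ratio + count` (treat the exact formula and the 1.5/10 defaults below as assumptions, not gospel):

```python
def shard_facet_limit(limit, ratio=1.5, count=10):
    # ratio=1.5 and count=10 are, I believe, the 4.10 defaults for
    # facet.overrequest.ratio and facet.overrequest.count.
    return int(limit * ratio) + count

# Raising either knob makes each shard return more candidates up front,
# lowering the odds a globally-top term is missing from some shard
# (less refinement) at the cost of bigger phase-1 responses.
print(shard_facet_limit(100))                        # defaults
print(shard_facet_limit(100, ratio=2.0, count=50))   # more overrequest
```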



-Hoss
http://www.lucidworks.com/
