lucene-dev mailing list archives

From "David Smiley (JIRA)" <>
Subject [jira] [Commented] (SOLR-9152) Change the default of facet.distrib.mco from false to true
Date Thu, 26 May 2016 04:50:12 GMT


David Smiley commented on SOLR-9152:

+1 to flip the default!

> Change the default of facet.distrib.mco from false to true
> ----------------------------------------------------------
>                 Key: SOLR-9152
>                 URL:
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Dennis Gove
>            Priority: Minor
> SOLR-8988 added a new query option, facet.distrib.mco, which when set to true allows the use of facet.mincount=1 in cloud mode. The previous behavior, and the current default, is to use facet.mincount=0 in cloud mode.
> h3. What exactly would be changed?
> The default of facet.distrib.mco=false would be changed to facet.distrib.mco=true.
> h3. When is this option effective?
> From the documentation,
> {code}
> /**
>  * If we are returning facet field counts, are sorting those facets by their count, and the minimum count to return is > 0,
>  * then allow the use of facet.mincount = 1 in cloud mode. To enable this use facet.distrib.mco=true.
>  *
>  * i.e. If the following three conditions are met in cloud mode: facet.sort=count, facet.limit > 0, facet.mincount > 0.
>  * Then use facet.mincount=1.
>  *
>  * Previously and by default facet.mincount will be explicitly set to 0 when in cloud mode for this condition.
>  * In SOLR-8599 and SOLR-8988, significant performance increase has been seen when enabling this optimization.
>  *
>  * Note: enabling this flag has no effect when the conditions above are not met. For those other cases the default behavior is sufficient.
>  */
> {code}
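
The three conditions in the comment above can be read as a small decision function. The following is a minimal sketch of that reading, not Solr's actual implementation: the class name `McoConditionSketch` and the method `shardMinCount` are hypothetical, and returning an empty value when the conditions are not met is an assumption made only to show that the flag has no effect there.

```java
import java.util.OptionalInt;

// Hypothetical sketch of the rule described in the quoted documentation.
// This is NOT Solr's code; names and structure are illustrative only.
final class McoConditionSketch {

    /**
     * Returns the facet.mincount that would be sent to each shard when the
     * three documented conditions hold, or empty when they do not (in which
     * case facet.distrib.mco has no effect and other logic applies).
     */
    static OptionalInt shardMinCount(String facetSort, int facetLimit,
                                     int facetMinCount, boolean distribMco) {
        boolean conditionsMet = "count".equals(facetSort)
                && facetLimit > 0
                && facetMinCount > 0;
        if (!conditionsMet) {
            return OptionalInt.empty(); // flag is irrelevant here
        }
        // With the flag on, shards see mincount=1; otherwise mincount=0.
        return OptionalInt.of(distribMco ? 1 : 0);
    }

    public static void main(String[] args) {
        System.out.println(shardMinCount("count", 10, 1, true));  // OptionalInt[1]
        System.out.println(shardMinCount("count", 10, 1, false)); // OptionalInt[0]
        System.out.println(shardMinCount("index", 10, 1, true));  // OptionalInt.empty
    }
}
```

Under this reading, changing the default only changes the result of the second call from 0 to 1; requests that fall outside the three conditions are untouched.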
> h3. What is the result of turning this option on?
> When facet.distrib.mco=true is used, and the conditions above are met, then when Solr
is sending requests off to the various shards it will include facet.mincount=1. The result
of this is that only terms with a count > 0 will be considered when processing the request
for that shard. This can result in a significant performance gain when the field has high
cardinality and the matching docset is relatively small because terms with 0 matches will
not be considered. 
> As shown in SOLR-8988, the runtime of a single query was reduced from 20 seconds to less
than 1 second.
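
For illustration, a request that satisfies the three conditions and opts in explicitly might carry parameters like the ones built below. This is a sketch using only the JDK: the class `McoRequestSketch`, the field name `category`, and the `q=*:*` query are assumptions for the example, while the facet parameter names are the ones discussed in this issue.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper that assembles a query string for a faceted request.
final class McoRequestSketch {

    static String facetQuery(String field, boolean distribMco) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "*:*");                 // assumed match-all query
        params.put("facet", "true");
        params.put("facet.field", field);
        params.put("facet.sort", "count");      // condition 1
        params.put("facet.limit", "10");        // condition 2: limit > 0
        params.put("facet.mincount", "1");      // condition 3: mincount > 0
        params.put("facet.distrib.mco", Boolean.toString(distribMco));
        return params.entrySet().stream()
                .map(e -> encode(e.getKey()) + "=" + encode(e.getValue()))
                .collect(Collectors.joining("&"));
    }

    private static String encode(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "category" is a made-up field name.
        System.out.println(facetQuery("category", true));
    }
}
```

If the default were flipped as proposed, the last parameter would only be needed to turn the optimization off.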
> h3. Can this change result in worse performance?
> The current thinking is no, worse performance won't be experienced even under non-optimal
scenarios. From the comments in SOLR-8988, 
> {quote}
> Consider you asked for up to 10 terms from shardA with mincount=1 but you received only
5 terms back. In this case you know, definitively, that a term seen in the response from shardB
but not in the response from shardA could have at most a count of 0 in shardA. If it had any
other count in shardA then it would have been returned in the response from shardA.
> Also, if you asked for up to 10 terms from shardA with mincount=1 and you get back a
response with 10 terms having a count >= 1 then the response is identical to the one you'd
have received if mincount=0. 
> Because of this, there isn't a scenario where the response would result in more work
than would have been required if mincount=0. For this reason, the decrease in required work
when mincount=1 is *always* either a moot point or a net win.
> {quote}
> The belief here is that it is safe to change the default of facet.distrib.mco such that facet.mincount=1 will be used when appropriate. The overall performance gain can be significant and no performance cost has been observed.
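
The argument quoted from SOLR-8988 can be checked on a toy shard. The sketch below is a simplified model, not Solr's faceting code: `MinCountShardSketch` and `topTerms` are hypothetical names, the per-shard counts are invented, and ties are broken by term purely to make the ordering deterministic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Toy model of one shard's facet response; not Solr's implementation.
final class MinCountShardSketch {

    /** Top `limit` terms by descending count (ties by term), keeping counts >= minCount. */
    static List<Map.Entry<String, Integer>> topTerms(Map<String, Integer> counts,
                                                     int limit, int minCount) {
        List<Map.Entry<String, Integer>> kept = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() >= minCount) {
                kept.add(Map.entry(e.getKey(), e.getValue()));
            }
        }
        kept.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return new ArrayList<>(kept.subList(0, Math.min(limit, kept.size())));
    }

    public static void main(String[] args) {
        // Invented counts for a single shard ("shardA" in the quote).
        Map<String, Integer> shardA = Map.of("a", 5, "b", 3, "c", 0, "d", 0);

        // Fewer terms than the limit come back with mincount=1, so any term
        // missing from this response must have a count of 0 on this shard.
        System.out.println(topTerms(shardA, 10, 1)); // [a=5, b=3]
        System.out.println(topTerms(shardA, 10, 0)); // [a=5, b=3, c=0, d=0]

        // When the limit is filled by terms with count >= 1, both settings
        // return the same response.
        System.out.println(topTerms(shardA, 2, 1).equals(topTerms(shardA, 2, 0))); // true
    }
}
```

In both cases the mincount=1 response carries the same information as the mincount=0 one, while the shard never has to consider the zero-count terms.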

This message was sent by Atlassian JIRA
