lucene-dev mailing list archives

From "James Dyer (JIRA)" <>
Subject [jira] [Commented] (SOLR-8934) Spellchecker collaction should return in popular order
Date Tue, 05 Apr 2016 14:24:25 GMT


James Dyer commented on SOLR-8934:

The collator does not test the candidate queries against the index unless
"spellcheck.maxCollationTries" is specified. Also, if you specify
"spellcheck.collateExtendedResults", it returns the number of hits for each collation.
If you are returning multiple collations, you can sort them in your client by the number
of hits.
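
The client-side sort described above could look roughly like the following. This is a minimal sketch, not part of Solr: the function name `collations_by_hits` and the hit counts in `SAMPLE` are made up for illustration, and the XML shape is what I would expect from "spellcheck.collateExtendedResults" (the tag used for "hits" may differ between versions, so the code matches on the `name` attribute only).

```python
import xml.etree.ElementTree as ET

# Hypothetical extended-results fragment; the hit counts are invented.
SAMPLE = """
<lst name="collations">
  <lst name="collation">
    <str name="collationQuery">positive analytic</str>
    <int name="hits">3</int>
  </lst>
  <lst name="collation">
    <str name="collationQuery">positive analytics</str>
    <int name="hits">12</int>
  </lst>
  <lst name="collation">
    <str name="collationQuery">predictive analytics</str>
    <int name="hits">40</int>
  </lst>
</lst>
"""

def collations_by_hits(xml_text):
    """Return (collationQuery, hits) pairs, most hits first."""
    root = ET.fromstring(xml_text)
    pairs = []
    for node in root.findall("lst[@name='collation']"):
        # Match on the name attribute so the element tag (int/long) does not matter.
        query = node.find("*[@name='collationQuery']").text
        hits = int(node.find("*[@name='hits']").text)
        pairs.append((query, hits))
    # sorted() is stable, so collations with equal hits keep Solr's order.
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)

for query, hits in collations_by_hits(SAMPLE):
    print(hits, query)
```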

The limitation here is that the collate functionality quits once it finds as many collations
as you requested, or when it runs out of combinations to try. Because it is expensive to
keep running these queries, we don't want to spend (much) extra time looking for the
collations with the most results. But you can force it to look for more by requesting more
collations.
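
As a rough illustration of "requesting more collations", the request parameters could be built like this. The parameter names are the spellcheck parameters discussed in this thread; the helper name `spellcheck_params` and the default values (5 collations, 50 tries) are arbitrary choices for the sketch, not recommendations.

```python
from urllib.parse import urlencode

def spellcheck_params(query, max_collations=5, max_tries=50):
    """Build a query string that asks Solr to verify and return collations.

    The defaults are illustrative only; higher values mean more test
    queries run against the index and therefore more work per request.
    """
    return urlencode({
        "spellcheck": "true",
        "spellcheck.q": query,
        "spellcheck.collate": "true",
        "spellcheck.maxCollations": max_collations,
        "spellcheck.maxCollationTries": max_tries,
        "spellcheck.collateExtendedResults": "true",
    })

# Ask for more collations so the client has more candidates to sort.
print(spellcheck_params("prditive analytiycs", max_collations=10))
```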

Also, you need to think about whether or not the number of hits is a good predictor of
relevance. Here, the spellchecker is trying the terms that have the smallest edit distance
to the user's query terms. So a collation with fewer hits and fewer edits will often have
better relevance than one with more hits and more edits.
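
To make the trade-off concrete, here is a small client-side sketch that orders collations by total edits first and uses hits only as a tie-breaker. It assumes plain Levenshtein distance and a term-by-term comparison, which is a simplification and not necessarily the distance measure your spellchecker is configured with. The names `levenshtein` and `rank` and the hit counts in the usage line are hypothetical.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]

def rank(original, candidates):
    """Sort (collation, hits) pairs by fewest total edits, then most hits."""
    def total_edits(collation):
        # Compare term by term; assumes both have the same number of terms.
        return sum(levenshtein(o, c)
                   for o, c in zip(original.split(), collation.split()))
    return sorted(candidates, key=lambda c: (total_edits(c[0]), -c[1]))

# Invented hit counts: the closer correction wins despite having fewer hits.
print(rank("prditive analytiycs",
           [("positive analytic", 900), ("predictive analytics", 40)]))
```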

With all this in mind, do you still see something here that should be done?  What is the bug
or new feature you think we need?

> Spellchecker collaction should return in popular order
> ------------------------------------------------------
>                 Key: SOLR-8934
>                 URL:
>             Project: Solr
>          Issue Type: Improvement
>          Components: spellchecker
>    Affects Versions: 5.5.1
>            Reporter: Michael Solomon
>            Priority: Minor
> From what I understand, Solr executes queries to determine whether the suggestions return results.
> {quote}
> The spellcheck.collate parameter only returns collations that are guaranteed to result
> in hits if re-queried, even when applying original fq parameters.
> It would be great if Solr ordered the collations by numFound, so the collations with
> more results appear first.
> {quote}
> e.g.:
> spellcheck.q = prditive analytiycs
> spellcheck.maxCollations = 5
> spellcheck.count=0
> response:
> {code:xml}
> <lst name="spellcheck">
>   <lst name="suggestions"/>
>   <bool name="correctlySpelled">false</bool>
>   <lst name="collations">
>     <str name="collation">positive analytic</str>
>     <str name="collation">positive analytics</str>
>     <str name="collation">predictive analytics</str>
>     <str name="collation">primitive analytics</str>
>     <str name="collation">punitive analytic</str>
>   </lst>
> </lst>
> {code}
> Obviously, "predictive analytics" has more results than "positive analytic".
