lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Facet counts and RankQuery
Date Tue, 21 Oct 2014 14:50:09 GMT
I _very strongly_ recommend that you do _not_ do this.

First, the "problem" of having documents in the results
list with, say, scores < 20% of the max takes care of itself;
users stop paging pretty quickly. You're arbitrarily
denying the users any chance of finding some documents
that _do_ match their query. A user may know that a
doc is in the corpus but be unable to find it. Very bad from
a confidence-building standpoint.

I've seen people put, say, 1-5 stars next to docs in the result
to give the user some visual cue that they're getting into "less
good" matches, but even that is of very limited value IMO. The
stars represent quintiles, 5 stars for docs > 80% of max, 4
stars between 60% and 80% etc.
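
For what it's worth, the quintile mapping is only a few lines. Here is a rough sketch in plain Python, not Solr code. The function name `stars` is made up for illustration, and I'm assuming a score at exactly 80% of max falls into the 4-star bucket, since "> 80%" gets 5.

```python
def stars(score, max_score):
    """Map a score to 1-5 stars by quintile of the maximum score.

    Hypothetical helper for illustration; not part of Solr or Lucene.
    """
    if max_score <= 0:
        raise ValueError("max_score must be positive")
    ratio = score / max_score
    # 5 stars for docs > 80% of max, 4 stars for > 60%, and so on.
    # Anything at or below 20% of max gets a single star.
    for threshold, rating in ((0.8, 5), (0.6, 4), (0.4, 3), (0.2, 2)):
        if ratio > threshold:
            return rating
    return 1
```

Note that every doc still gets shown; the stars only hint at relative quality.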

If you insist on this, then you'll need to run two passes
across the data: the first gets the max score, and the second
uses a custom collector that is somehow given this number
and rejects any docs below the threshold.
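
To make the two-pass idea concrete, here is a minimal sketch in plain Python over an in-memory list of hits. It is not a Lucene collector and uses no Solr APIs. The name `two_pass_threshold` and the `(doc_id, score, facet_value)` tuple layout are assumptions for illustration only. The point it shows is that the facet counts have to be computed from the same filtered set, otherwise they keep reflecting the unsuppressed results.

```python
from collections import Counter


def two_pass_threshold(hits, fraction):
    """Filter hits by a fraction of the max score and count facets.

    hits: list of (doc_id, score, facet_value) tuples (hypothetical layout).
    fraction: threshold relative to the max score, e.g. 0.2 for 20%.
    Returns (kept_hits, facet_counts).
    """
    if not hits:
        return [], Counter()

    # Pass 1: find the maximum score over all matching docs.
    max_score = max(score for _, score, _ in hits)
    cutoff = fraction * max_score

    # Pass 2: reject any docs below the cutoff.
    kept = [hit for hit in hits if hit[1] >= cutoff]

    # Count facets over the kept docs only, so the counts agree
    # with the reduced number of results.
    facets = Counter(facet for _, _, facet in kept)
    return kept, facets
```

With hits scored 10.0, 5.0, 3.0 and 1.0 and a fraction of 0.2, the cutoff is 2.0, so only the doc scored 1.0 is dropped and it no longer contributes to any facet count.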

Best,
Erick

On Tue, Oct 21, 2014 at 3:09 AM, Parvesh Garg <parvesh@zettata.com> wrote:
> Hi All,
>
> We have written a RankQuery plugin with a custom TopDocsCollector to
> suppress documents below a certain threshold w.r.t. the maxScore for
> that query. It works fine and is reflected well with numFound and start
> parameters.
>
> Our problem lies with facet counts. Even though numFound reports a
> much smaller number, the facet counts are still computed from the
> unsuppressed query results.
>
> E.g., in a test with a threshold of 20%, we reduced the totalDocs from
> 46030 to 6080, but the top facet count on a field is still 20500.
>
> The query parameter we are using looks like rq={!threshold value=0.2}
>
> Is there a way to propagate the suppression of results to FacetsComponent as
> well? Can we send the same rq to FacetsComponent?
>
>
>
> Regards,
> Parvesh Garg,
>
> http://www.zettata.com
