lucene-dev mailing list archives

From "Shawn Heisey (JIRA)" <>
Subject [jira] [Commented] (SOLR-12952) TFIDF scorer uses max docs instead of num docs when using Edismax
Date Thu, 01 Nov 2018 12:46:00 GMT


Shawn Heisey commented on SOLR-12952:

Lucene scoring has always been influenced by deleted docs.  SolrCloud's NRT replica type has
always had the potential for this problem, because each NRT replica indexes and merges its
segments independently, so different replicas can hold different numbers of deleted
documents.  If you use TLOG/PULL replica types (new in 7.x), the replicas copy index segments
from the leader, so once replication has caught up all replicas will be identical and this
can't happen.

It is not a bug.  I don't know if eliminating the influence of deleted documents on the scores
in a query is even *possible*.  Attempting to do it would kill performance.
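
To illustrate why deleted docs move the score, here is a minimal sketch in Python.  It
assumes an IDF of the shape used by Lucene's ClassicSimilarity, 1 + ln((N + 1) / (df + 1)),
where both N and df still count documents that are marked deleted but not yet merged away.
The function name and the replica numbers are hypothetical, chosen only for illustration;
this is not Solr's actual code path.

```python
import math


def classic_idf(doc_freq: int, doc_count: int) -> float:
    """IDF in the ClassicSimilarity shape: 1 + ln((N + 1) / (df + 1)).

    Assumption for this sketch: doc_count and doc_freq both include
    documents that are deleted but still present in unmerged segments.
    """
    return 1.0 + math.log((doc_count + 1) / (doc_freq + 1))


# Two hypothetical NRT replicas holding the same 1000 live documents,
# 50 of which contain the query term.
# Replica A has just merged, so it carries no deleted documents.
replica_a = {"doc_count": 1000, "doc_freq": 50}
# Replica B still carries 200 deleted documents, 20 of which contain the term.
replica_b = {"doc_count": 1200, "doc_freq": 70}

idf_a = classic_idf(replica_a["doc_freq"], replica_a["doc_count"])
idf_b = classic_idf(replica_b["doc_freq"], replica_b["doc_count"])

# Same live data, different term statistics, so different scores.
print(f"replica A idf = {idf_a:.4f}")
print(f"replica B idf = {idf_b:.4f}")
```

With these made-up numbers the two replicas return different IDF values for the same term
even though their live documents are identical, which matches the behavior described in the
report.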

> TFIDF scorer uses max docs instead of num docs when using Edismax
> -----------------------------------------------------------------
>                 Key: SOLR-12952
>                 URL:
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: mosh
>            Priority: Major
> I have recently noticed some odd behavior while using the edismax query parser.
> The scores returned by documents seem to be affected by deleted documents, which have
> yet to be merged and completely removed from the index.
> This causes different shards to return different scores for the same query.
> Is this a bug, or am I missing something?
