lucene-solr-user mailing list archives

From Morten Bøgeskov ...@dbc.dk>
Subject SolrCloud different score for same document on different replicas.
Date Thu, 05 Jan 2017 13:30:58 GMT


Hi.

We've got a SolrCloud which is sharded and has a replication factor of
2.

The 2 replicas of a shard may look like this:

Num Docs:    5401023
Max Doc:    6388614
Deleted Docs:    987591


Num Docs:    5401023
Max Doc:    5948122
Deleted Docs:    547099

We've seen >10% difference in Max Doc at times with the same Num Docs.
Our use case is a few documents that are searched and many small ones
that are filtered against (often updated multiple times a day), so the
difference in deleted docs isn't surprising.

This results in a different score for a document depending on which
replica it comes from. As I see it: it has to do with the different
maxDoc value when calculating idf.
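To illustrate the suspected effect, here is a rough sketch in Python
(not Solr code). It assumes the classic Lucene TF-IDF form
idf = 1 + ln(maxDoc / (docFreq + 1)); the similarity actually
configured on the collection may use a different formula. The maxDoc
values are the two from above, while ASSUMED_DOC_FREQ is a made-up
number, held equal on both replicas for simplicity (in practice docFreq
can also differ, since deleted documents still count until merged away).

```python
import math


def classic_idf(doc_freq: int, max_doc: int) -> float:
    """Classic TF-IDF style idf: 1 + ln(maxDoc / (docFreq + 1))."""
    return 1.0 + math.log(max_doc / (doc_freq + 1))


# Max Doc values reported for the two replicas of the same shard.
MAX_DOC_REPLICA_A = 6388614
MAX_DOC_REPLICA_B = 5948122

# Hypothetical document frequency for some query term (an assumption,
# not a measured value).
ASSUMED_DOC_FREQ = 1000

idf_a = classic_idf(ASSUMED_DOC_FREQ, MAX_DOC_REPLICA_A)
idf_b = classic_idf(ASSUMED_DOC_FREQ, MAX_DOC_REPLICA_B)

print(f"idf on replica A: {idf_a:.4f}")
print(f"idf on replica B: {idf_b:.4f}")
print(f"difference:       {idf_a - idf_b:.4f}")
```

Under this formula the gap between the two idf values depends only on
the ratio of the two maxDoc values, so it shows up for every term, no
matter how rare or common the term is.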

This in turn alters a specific document's position in the search
result over reloads. This is quite confusing (duplicates in pagination).
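A toy example of how this can produce duplicates in pagination,
sketched in Python. The document ids and scores are invented for
illustration; the only assumption is that two replicas score the same
documents slightly differently and that consecutive page requests are
served by different replicas.

```python
# Hypothetical scores for the same three documents on two replicas.
scores_replica_a = {"doc1": 2.10, "doc2": 2.05, "doc3": 1.50}
scores_replica_b = {"doc1": 2.00, "doc2": 2.08, "doc3": 1.50}


def page(scores, start, rows):
    """Return the ids for one result page, ranked by descending score."""
    ranked = sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
    return ranked[start:start + rows]


# Page 1 is answered by replica A, page 2 by replica B.
page_1 = page(scores_replica_a, start=0, rows=1)
page_2 = page(scores_replica_b, start=1, rows=1)

print("page 1:", page_1)
print("page 2:", page_2)
```

Replica A ranks doc1 first while replica B ranks doc2 first, so doc1
ends up on both pages and doc2 is never shown.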

What is the trick to get a homogeneous score from different replicas?
We've tried using ExactStatsCache & ExactSharedStatsCache, but that
didn't seem to make any difference.

Any hints on this will be greatly appreciated.

-- 
 Morten Bøgeskov <mb@dbc.dk>

