lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Solr relevancy score different on replicated nodes
Date Wed, 09 Jan 2019 00:42:27 GMT
bq. Shouldn't both replica and leader come to same state
after this much long period.

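No. Even after that long, the docs themselves will be the
same: every doc present on one replica will be present and
searchable on the other. However, they will be in different
segments, and deleted docs still count toward the term
statistics until their segment is merged away, so the
"stats skew" will remain.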
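
To make the skew concrete, here is a rough sketch in Python. It
assumes the BM25-style idf formula used by Lucene's default
similarity; the replica names and all document counts are made-up
numbers for illustration, not taken from the index discussed here.
Both replicas hold the same live docs, but one still carries
deleted docs that have not been merged away, so its term
statistics (and therefore its idf) differ.

```python
import math

def bm25_idf(doc_freq, doc_count):
    """BM25-style idf: log(1 + (N - n + 0.5) / (n + 0.5))."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))

# Hypothetical numbers: 1000 live docs, 50 of which contain the term.
live_docs, live_with_term = 1000, 50

# Replica A has already merged away all of its deleted docs.
idf_a = bm25_idf(live_with_term, live_docs)

# Replica B still holds 200 deleted docs in unmerged segments,
# 30 of which contain the term. They still count toward the stats.
idf_b = bm25_idf(live_with_term + 30, live_docs + 200)

print(f"replica A idf: {idf_a:.4f}")
print(f"replica B idf: {idf_b:.4f}")
```

The same query against the same live docs would then score
differently on the two replicas, even though the ranking within
each replica is still consistent.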

But displaying the scores isn't a good reason to worry about
this. Frankly, displaying them is almost always a mistake.
Scores are meaningless outside of ranking the docs _in a
single query_. Just because a doc in one query got a score of
10 and some other doc in some other query scored 5 doesn't
say anything at all about whether one was "twice as good" as
the other. Even within the same query, a score of 10 versus
5 doesn't mean one doc is "twice as good".

Frankly, I think this is a waste of effort. At best, I've seen
UIs that display, say, 1 to 5 stars, where the stars just
show where the particular doc's score falls
_relative to the max score of that query_, unrelated
to any other query.
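
A minimal sketch of that kind of display, in Python. The function
name `stars` and the choice of rounding up into five equal buckets
are assumptions for illustration; `max_score` stands for the top
score returned for the same query (Solr reports it as `maxScore`
in the response when scores are requested).

```python
import math

def stars(score, max_score, buckets=5):
    """Map a doc's score to 1..buckets stars relative to the
    max score of the SAME query. Returns 0 if there is no
    usable max score."""
    if max_score <= 0:
        return 0
    # Clamp the ratio to [0, 1] to guard against odd inputs.
    ratio = min(max(score / max_score, 0.0), 1.0)
    # Round up so every returned doc gets at least one star.
    return max(1, math.ceil(ratio * buckets))

# Hypothetical scores from a single query's result list.
max_score = 10.0
for score in (10.0, 5.0, 1.0):
    print(score, stars(score, max_score))
```

The stars only say how a doc compares to the best hit of that one
query; they carry no meaning across queries.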

If you insist (and again, I think it's a mistake), you can
optimize periodically. But if you're using anything
earlier than Solr 7.5, optimizing has its own traps, and I
do NOT recommend it unless you can do it every time
you change your index. See:
https://lucidworks.com/2017/10/13/segment-merging-deleted-documents-optimize-may-bad/
and
https://lucidworks.com/2018/06/20/solr-and-optimizing-your-index-take-ii/

On Tue, Jan 8, 2019 at 7:28 AM Ashish Bisht <bishtashish77@gmail.com> wrote:
>
> Thank you Erick for explaining.
>
> In my scenario, I stopped indexing and updates too, and waited for 1 day.
> Restarted Solr too. Shouldn't both replica and leader come to the same state
> after this long a period? As you said this gets corrected by segment
> merging, I hope it is an internal process and no manual activity is required.
>
> For us the score matters, as we are using it to display some scenarios on search
> and it gave changing values. As of now we are dependent on a single
> shard-replica, but in future we might need more replicas.
> Will planning indexing and updates outside peak query hours help?
>
> I have tried the exact cache while debugging the score difference during
> sharding. It didn't help much. Anyhow, that's a different topic.
>
> Thanks again,
>
> Regards
> Ashish Bisht
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
