lucene-java-user mailing list archives

From Cameron M VandenBerg
Subject Distributed IDF for Solr using ExactStatsCache issue
Date Tue, 16 Mar 2021 19:55:05 GMT

I am using Solr in a distributed environment where I have split my collection into parts,
each running on a different node.  When I create each part of the collection, I set
numShards and replicationFactor to 1.  Query speed is most important to us, and we are
not worried about load on the system.

I want a Distributed IDF across all parts of the collection so I have added this line to my
<statsCache class="" />

This seems to work about 90% of the time, but if I run the same request over and over again,
sometimes I get scores with a local IDF for just one part of the collection.  Here is a request

I still get documents from both collection1 and collection2, but sometimes I get scores that
are the same as when I query only collection1.  I believe that in that case it is using only
the document frequency from collection1 for the term.
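
To illustrate why the scores would shift in that case, here is a minimal sketch. It assumes the default BM25 similarity, whose IDF is ln(1 + (N - n + 0.5) / (n + 0.5)) for document frequency n and document count N. The per-collection numbers are made up for illustration and are not taken from my index.

```python
import math


def bm25_idf(doc_freq: int, doc_count: int) -> float:
    """IDF in the BM25 form: ln(1 + (N - n + 0.5) / (n + 0.5))."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


# Hypothetical statistics for one query term (illustrative numbers only).
stats = {
    "collection1": {"doc_freq": 10, "doc_count": 1000},
    "collection2": {"doc_freq": 400, "doc_count": 1000},
}

# Local IDF: each collection uses only its own document frequency and size.
local_idf = {
    name: bm25_idf(s["doc_freq"], s["doc_count"]) for name, s in stats.items()
}

# Distributed IDF: frequencies and sizes are summed across collections first.
global_idf = bm25_idf(
    sum(s["doc_freq"] for s in stats.values()),
    sum(s["doc_count"] for s in stats.values()),
)

for name, value in local_idf.items():
    print(f"{name} local IDF: {value:.3f}")
print(f"distributed IDF: {global_idf:.3f}")
```

With these numbers the local IDF for collection1 is much higher than the distributed one, so a score computed from collection1 statistics alone would differ from the score I expect.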

Should I use a different configuration?  I would like to make sure the IDF is always distributed
and the same every time I run the same query.  Is there any technique I could use to ensure
that this happens?

Thank you,
Cameron VandenBerg
