Hi all,

I looked into this today. I can reproduce it, and I believe it's a bug.
It is caused by the following two changes working together:
- LUCENE-7386 Flatten nested disjunctions
- LUCENE-7925 Deduplicate SHOULD and MUST clauses in BooleanQuery

Blended term queries modify the df/ttf of their terms so that all terms in the same blended query produce identical scores. In this case, there are two blended term queries, each containing a few terms, only some of which overlap. Because the two term sets differ, each query computes different df/ttf values for its terms. During the rewrite process:
  1. the two blended queries get rewritten as Boolean queries themselves, with each (modified) TermQuery as a SHOULD clause
  2. the nested Boolean queries get flattened, since they are nested disjunctions
  3. the term queries (some of which are actually wrapped in Boost queries) are deduplicated, with one of the two TermQuery instances, along with its modified TermStates, being picked at random (the randomness is due to the HashSet underlying Lucene's MultiSet).
I haven't managed to create a failing test yet; I'll share one when I have it ready.
If anybody has suggestions or pointers on how this should be fixed, I'm happy to provide a patch. I'm just not sure what the right fix would be here: I have a feeling that step 2 should not happen for (rewritten) blended queries?
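
To make the three steps above concrete, here is a minimal sketch in plain Java with no Lucene dependency. Everything in it is hypothetical: the Clause class only stands in for a rewritten TermQuery with its modified term statistics, the raw docFreq values are invented so that the two groups blend to 3 and 2, blending is assumed to assign the group's maximum docFreq to every term, and deduplication is assumed to keep whichever clause is encountered first. It is an illustration of the suspected mechanism, not the actual Lucene code path.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of the three rewrite steps; none of these types are Lucene classes.
class BlendedDedupSketch {

    // Stand-in for a rewritten TermQuery that carries a blended docFreq.
    static final class Clause {
        final String term;
        final int docFreq;

        Clause(String term, int docFreq) {
            this.term = term;
            this.docFreq = docFreq;
        }
    }

    // Step 1: every term in a blended group gets the same docFreq.
    // Assumption: the blended value is the maximum raw docFreq in the group.
    static List<Clause> blend(Map<String, Integer> rawDocFreqs) {
        int blended = 0;
        for (int df : rawDocFreqs.values()) {
            blended = Math.max(blended, df);
        }
        List<Clause> clauses = new ArrayList<>();
        for (String term : rawDocFreqs.keySet()) {
            clauses.add(new Clause(term, blended));
        }
        return clauses;
    }

    // Steps 2 and 3: flatten the groups into one list of SHOULD clauses, then
    // deduplicate by term only. Assumption: the first clause encountered wins.
    static Map<String, Integer> flattenAndDedup(List<List<Clause>> groups) {
        Map<String, Integer> kept = new LinkedHashMap<>();
        for (List<Clause> group : groups) {
            for (Clause clause : group) {
                kept.putIfAbsent(clause.term, clause.docFreq);
            }
        }
        return kept;
    }

    // Hypothetical raw docFreq values; this group blends to docFreq 3.
    static List<Clause> groupA() {
        Map<String, Integer> raw = new LinkedHashMap<>();
        raw.put("ALFR", 1);
        raw.put("ALTR", 2);
        raw.put("ANTR", 3);
        raw.put("LTR", 1);
        return blend(raw);
    }

    // Hypothetical raw docFreq values; this group blends to docFreq 2.
    static List<Clause> groupB() {
        Map<String, Integer> raw = new LinkedHashMap<>();
        raw.put("ALTR", 2);
        raw.put("FLTMR", 1);
        raw.put("LTR", 1);
        return blend(raw);
    }

    public static void main(String[] args) {
        // The only difference between the two calls is the encounter order.
        System.out.println(flattenAndDedup(List.of(groupA(), groupB())).get("ALTR")); // prints 3
        System.out.println(flattenAndDedup(List.of(groupB(), groupA())).get("ALTR")); // prints 2
    }
}
```

In this model the surviving docFreq for ALTR depends only on which group is visited first, which is the kind of order dependence an unordered hash-based collection could introduce.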

Cheers,
Michele


On Tue, Mar 3, 2020 at 7:55 PM Fiona Hasanaj <fiona@basistech.com> wrote:
Hello,

I’m Fiona with Basis Technology. We’re investigating what we believe to be a bug involving inconsistent query results. We bisected the issue and found that it first appears with the flattening of nested disjunctions introduced by the merge of LUCENE-7386. To reproduce the issue, I have attached a Lucene index built with Lucene 8.1.0 (names_index.tar.gz) and a Java class (LuceneSearchIndex.java). If you run the class multiple times against Lucene 8.0.0, max_score is the same on every run, whereas against Lucene 8.1.0 max_score is inconsistent between runs (within about 10 runs you should see it return 1.8651859 on some runs and 2.1415303 on others).

Debugging in Lucene 8.1.0 shows that, before its nested disjunctions are flattened, the query against the name index looks like this:

(((bt_rni_name_encoded_1:ALFR)^0.75 bt_rni_name_encoded_1:ALTR (bt_rni_name_encoded_1:ANTR)^0.75 (bt_rni_name_encoded_1:LTR)^0.6666666) ((bt_rni_name_encoded_1:ALTR)^0.75 (bt_rni_name_encoded_1:FLTMR)^0.75 (bt_rni_name_encoded_1:FLTRN)^0.75 (bt_rni_name_encoded_1:FLTS)^0.75 (bt_rni_name_encoded_1:FTR)^0.6666666 (bt_rni_name_encoded_1:LTR)^0.6666666)) | (((bt_rni_name_encoded_2:FLTR)^0.75) (bt_rni_name_encoded_2:FLTR (bt_rni_name_encoded_2:FLTRN)^0.75))

The term that causes the difference in the final score is bt_rni_name_encoded_1:ALTR. As the query above shows, it appears twice, nested under different clauses: in the first clause its docFreq is 3, and in the second clause its docFreq is 2. This happens in Lucene 8.0.0 as well. Is it expected behaviour for the same term to be read with different docFreq values?

After flattening the nested disjunctions (part of the query rewrite process), the query looks like this:

((bt_rni_name_encoded_1:FTR)^0.6666666 (bt_rni_name_encoded_1:FLTRN)^0.75 (bt_rni_name_encoded_1:FLTMR)^0.75 (bt_rni_name_encoded_1:ALFR)^0.75 (bt_rni_name_encoded_1:FLTS)^0.75 (bt_rni_name_encoded_1:ANTR)^0.75 (bt_rni_name_encoded_1:LTR)^1.3333333 (bt_rni_name_encoded_1:ALTR)^1.75) | ((bt_rni_name_encoded_2:FLTRN)^0.75 (bt_rni_name_encoded_2:FLTR)^1.75)

As you can see, bt_rni_name_encoded_1:ALTR now appears only once, and its boosts from the original query have been summed (1 + 0.75 = 1.75). This is the version of the query that actually gets used. Here the docFreq for the bt_rni_name_encoded_1:ALTR term is sometimes 3 and sometimes 2 from run to run, and the final score changes accordingly. Is this "coin toss" pick of docFreq for the same term expected behaviour?
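
As a sanity check on the boosts, the following plain-Java sketch sums the boosts of the ten bt_rni_name_encoded_1 clauses from the query before flattening, treating an unboosted clause as boost 1.0. The class and method names are made up for illustration, and this is not how Lucene implements the deduplication internally; it only shows that per-term summation reproduces the boosts seen in the flattened query.

```java
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration of boost summing during deduplication; not Lucene code.
class BoostSumSketch {

    // The ten bt_rni_name_encoded_1 clauses from the query before flattening.
    // An unboosted clause (the first ALTR) is treated as having boost 1.0.
    static final String[] TERMS = {
        "ALFR", "ALTR", "ANTR", "LTR",
        "ALTR", "FLTMR", "FLTRN", "FLTS", "FTR", "LTR"
    };
    static final double[] BOOSTS = {
        0.75, 1.0, 0.75, 0.6666666,
        0.75, 0.75, 0.75, 0.75, 0.6666666, 0.6666666
    };

    // Sums the boosts of clauses that share the same term.
    static Map<String, Double> sumBoosts(String[] terms, double[] boosts) {
        Map<String, Double> summed = new TreeMap<>();
        for (int i = 0; i < terms.length; i++) {
            summed.merge(terms[i], boosts[i], Double::sum);
        }
        return summed;
    }

    public static void main(String[] args) {
        Map<String, Double> summed = sumBoosts(TERMS, BOOSTS);
        System.out.println("ALTR -> " + summed.get("ALTR")); // 1.0 + 0.75
        System.out.println("LTR  -> " + summed.get("LTR"));  // 0.6666666 + 0.6666666
    }
}
```

The summation explains the boosts of ALTR and LTR in the flattened query, but it says nothing about which of the two docFreq values survives for the merged term, which is the part that varies between runs.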

It looks like the issue stems from one of the two behaviours questioned above.

Looking forward to hearing back from you.

