lucene-solr-user mailing list archives

From Thomas Aglassinger <>
Subject Inconsistent debugQuery score with multiplicative boost
Date Fri, 04 Jan 2019 08:11:33 GMT

When debugging a query using multiplicative boost based on the product() function I noticed
that the score computed in the explain section is correct while the score in the actual result
is wrong.

As an example, here’s a simple query that boosts on the field name_text_de (containing German
product names). The term “Netzteil” boosts the score to 200% and “Sony” boosts it to 300%. A name
that contains both terms would be boosted to 600%. If a term does not match, a default pseudo
boost of 1 is used (the multiplicative identity). The params of the responseHeader in the query
result are:

"q":"{!boost b=$ymb}(+{!lucene v=$yq})",

The parsed query of the ymb parameter translates to:

FunctionScoreQuery(FunctionScoreQuery(+*:*, scored by boost(product(query((ConstantScore(name_text_de:netzteil))^2.0,def=1.0),query((ConstantScore(name_text_de:sony))^3.0,def=1.0)))))
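
To make the intended arithmetic explicit, here is a minimal Python sketch of the boost I expect per product. It is only a model of the product() function with def=1.0, not what Solr executes: the BOOSTS table and the expected_boost helper are made up for illustration, and the tokens are split by hand instead of running through the text_de analyzer.

```python
from math import prod

# Boost factors taken from the example query: term -> multiplier.
BOOSTS = {"netzteil": 2.0, "sony": 3.0}


def expected_boost(name_tokens, boosts=BOOSTS, default=1.0):
    """Multiply the boost of every configured term.

    A term that does not occur in the name contributes `default`,
    which is the multiplicative identity when it is 1.0.
    """
    tokens = {token.lower() for token in name_tokens}
    return prod(boosts[term] if term in tokens else default for term in boosts)


# Hand-split tokens (assumption: the real analyzer yields equivalent terms).
print(expected_boost(["Original", "Sony", "Vaio", "Netzteil"]))  # 6.0
print(expected_boost(["GS", "Netzteil", "20W", "schwarz"]))      # 2.0
```

These are the values the explain section reports for both products described here.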

For a product that contains both terms, the score in the result and explain section correctly
yields 6.0:

"name_text_de":"Original Sony Vaio Netzteil",

6.0 = product of:
  1.0 = boost
  6.0 = product of:
    1.0 = *:*
    6.0 = product(query((ConstantScore(name_text_de:netzteil))^2.0,def=1.0)=2.0,query((ConstantScore(name_text_de:sony))^3.0,def=1.0)=3.0)

However, for a product with only “Netzteil” in the name, the score in the result is wrongly 1.0,
while the explain section correctly shows 2.0:

"name_text_de":"GS-Netzteil 20W schwarz",

2.0 = product of:
  1.0 = boost
  2.0 = product of:
    1.0 = *:*
    2.0 = product(query((ConstantScore(name_text_de:netzteil))^2.0,def=1.0)=2.0,query((ConstantScore(name_text_de:sony))^3.0,def=1.0)=1.0)

(Note: the filter chain splits words on hyphens, so the “GS-” in front of the “Netzteil”
should not be an issue.)

Here’s the complete filter chain for the text_de field type:

<fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
    <!-- <analyzer> wrapper assumed; it does not appear in the excerpt as posted -->
    <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory" />
        <filter class="solr.ManagedSynonymGraphFilterFactory" managed="de" />
        <filter class="solr.ManagedStopFilterFactory" managed="de" />
        <filter class="solr.WordDelimiterGraphFilterFactory" preserveOriginal="1"
                generateWordParts="1" generateNumberParts="1" catenateWords="1"
                catenateNumbers="1" catenateAll="0" splitOnCaseChange="1" />
        <filter class="solr.LowerCaseFilterFactory" />
        <filter class="solr.ASCIIFoldingFilterFactory" />
        <filter class="solr.GermanStemFilterFactory" />
        <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
    </analyzer>
</fieldType>

Interestingly, if I simplify the query to boost only on “Netzteil”, the score is correctly 2.0
in both the result and the explain section.

I reproduced this with a local Solr 7.5.0 server (no sharding, no replicas) on macOS 10.14.1.
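
For anyone who wants to check their own setup, the comparison can be automated roughly as follows. This is a sketch under several assumptions: the uniqueKey field is called id, the response is requested with fl=*,score and debug.explain.structured=true so that each explain entry carries a numeric "value", and the function names and sample document ids are made up for illustration.

```python
import json
import urllib.parse
import urllib.request


def find_score_mismatches(response, id_field="id", tolerance=1e-6):
    """Return (doc_id, result_score, explain_score) for every document whose
    returned score differs from the top-level value of its structured explain."""
    explains = response.get("debug", {}).get("explain", {})
    mismatches = []
    for doc in response["response"]["docs"]:
        doc_id = str(doc[id_field])
        explain = explains.get(doc_id)
        if explain is None or "score" not in doc:
            continue  # nothing to compare for this document
        if abs(doc["score"] - explain["value"]) > tolerance:
            mismatches.append((doc_id, doc["score"], explain["value"]))
    return mismatches


def fetch(base_url, params):
    """Run a select request with debugging enabled and return the decoded JSON.

    base_url is the core URL, e.g. http://localhost:8983/solr/<core> (hypothetical).
    """
    query = urllib.parse.urlencode({
        **params,
        "fl": "*,score",
        "debugQuery": "true",
        "debug.explain.structured": "true",
        "wt": "json",
    })
    with urllib.request.urlopen(f"{base_url}/select?{query}") as reply:
        return json.load(reply)


# Hand-made response that mirrors the two products from this report.
sample = {
    "response": {"docs": [
        {"id": "1", "score": 6.0},  # contains "Sony" and "Netzteil"
        {"id": "2", "score": 1.0},  # contains only "Netzteil"
    ]},
    "debug": {"explain": {
        "1": {"match": True, "value": 6.0, "description": "product of:"},
        "2": {"match": True, "value": 2.0, "description": "product of:"},
    }},
}
print(find_score_mismatches(sample))  # [('2', 1.0, 2.0)]
```

Against a live server, find_score_mismatches(fetch(base_url, params)) would list every document where the two scores disagree.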

I found mention of a somewhat similar situation with BooleanQuery, which was considered a
bug and fixed in 2016:

So my questions are:

1. Is there something wrong in my query that prevents the “Netzteil”-only product from getting
a score of 2.0?
2. Shouldn’t the score in the result and the explain section always be the same?

Best regards,