lucene-java-user mailing list archives

From Ahmet Arslan <iori...@yahoo.com.INVALID>
Subject Re: Isn't fieldLength in BM25 supposed to be an integer?
Date Wed, 09 Nov 2016 20:48:12 GMT
Hi Mossaab,

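This is most likely due to the encodeNormValue/decodeNormValue transformation of the document length. The length is stored at index time as a lossy single-byte norm, so the value decoded at search time (and shown by explain) is only an approximation of the true length.

Please see those two methods in BM25Similarity.java.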
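
As a rough illustration, here is a self-contained sketch of that round trip as I understand it from the 6.x sources. It does not call Lucene itself: the two byte315 helpers are meant to mirror SmallFloat.floatToByte315 and SmallFloat.byte315ToFloat, the index-time boost is assumed to be 1, and the class name NormRoundTrip and the method decodedLength are made up for this example.

```java
public class NormRoundTrip {

    // Assumed to mirror SmallFloat.floatToByte315: keep only the top bits
    // of the float (5 exponent bits, zero exponent point 15) in one byte.
    static byte floatToByte315(float f) {
        int bits = Float.floatToRawIntBits(f);
        int smallfloat = bits >> (24 - 3);
        if (smallfloat <= ((63 - 15) << 3)) {
            // Zero, negative, or too small to represent
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (smallfloat >= ((63 - 15) << 3) + 0x100) {
            return -1; // overflow: largest representable value
        }
        return (byte) (smallfloat - ((63 - 15) << 3));
    }

    // Assumed to mirror SmallFloat.byte315ToFloat: the inverse mapping.
    static float byte315ToFloat(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << (24 - 3);
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    // Hypothetical helper: encode 1/sqrt(length) into a byte (boost assumed
    // to be 1), then decode it back into the length that scoring would see.
    static float decodedLength(int fieldLength) {
        byte norm = floatToByte315(1.0f / (float) Math.sqrt(fieldLength));
        float f = byte315ToFloat(norm);
        return 1.0f / (f * f);
    }

    public static void main(String[] args) {
        System.out.println(decodedLength(2)); // prints 2.56
    }
}
```

For a two-term field, 1/sqrt(2) is about 0.707, which gets truncated to 0.625 in the byte encoding, and 1/(0.625 * 0.625) = 2.56. That matches the fieldLength in your explain output.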

Ahmet





On Wednesday, November 9, 2016 10:25 PM, Mossaab Bagdouri <bagdouri_mossaab@yahoo.fr.INVALID> wrote:
Hi,

On Lucene 6.2.1, I have the following explain output for a document that
contains two words. I'm wondering why the value of fieldLength is not 2.

A related question was posted on S.O. two years ago:
http://stackoverflow.com/questions/22194920

23.637165 = sum of:
  10.065297 = weight(title:googl in 401658357) [BM25Similarity], result of:
    10.065297 = score(doc=401658357,freq=1.0 = termFreq=1.0
), product of:
      7.3866553 = idf(docFreq=414179, docCount=668609139)
      1.3626325 = tfNorm, computed from:
        1.0 = termFreq=1.0
        1.2 = parameter k1
        0.75 = parameter b
        7.3254013 = avgFieldLength
        2.56 = fieldLength
  13.571868 = weight(title:hangout in 401658357) [BM25Similarity], result of:
    13.571868 = score(doc=401658357,freq=1.0 = termFreq=1.0
), product of:
      9.960035 = idf(docFreq=31592, docCount=668609139)
      1.3626325 = tfNorm, computed from:
        1.0 = termFreq=1.0
        1.2 = parameter k1
        0.75 = parameter b
        7.3254013 = avgFieldLength
        2.56 = fieldLength

Regards,
Mossaab
