lucene-solr-user mailing list archives

From Andrzej Bialecki ...@getopt.org>
Subject Re: Is negative boost possible?
Date Mon, 12 Oct 2009 09:58:04 GMT
Yonik Seeley wrote:
> On Sun, Oct 11, 2009 at 6:04 PM, Lance Norskog <goksron@gmail.com> wrote:
>> And the other important
>> thing to know about boost values is that the dynamic range is about
>> 6-8 bits
> 
> That's an index-time boost - an 8 bit float with 5 bits of mantissa
> and 3 bits of exponent.
> Query time boosts are normal 32 bit floats.

To be more specific: the index-time float encoding cannot represent negative
numbers (see SmallFloat), but query-time boosts can be negative, and
they DO affect the score - see below. BTW, the standard Collectors keep
only results with positive scores, so if you also want results with
negative scores you need to use a custom Collector.
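
To illustrate why a negative index-time boost cannot survive encoding, here is a minimal sketch of an 8-bit float codec. The class and method names are hypothetical, and the bit manipulation is modeled on SmallFloat's floatToByte315 / byte315ToFloat as recalled rather than copied from the source, so verify it against the Lucene version in use before relying on it.

```java
// Hypothetical stand-in for Lucene's SmallFloat; not the library class itself.
class SmallFloatSketch {
    // Threshold at or below which a value is treated as zero or underflow.
    private static final int FZERO = (63 - 15) << 3;

    // Pack a 32-bit float into one byte (assumed to mirror floatToByte315).
    static byte encode(float f) {
        int bits = Float.floatToRawIntBits(f);
        int small = bits >> (24 - 3);      // arithmetic shift keeps the sign bit
        if (small <= FZERO) {
            // Negative floats have the sign bit set, so bits <= 0 and they collapse to 0.
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (small >= FZERO + 0x100) {
            return (byte) -1;              // overflow: clamp to the largest encodable value
        }
        return (byte) (small - FZERO);
    }

    // Unpack the byte back into a float (assumed to mirror byte315ToFloat).
    static float decode(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << (24 - 3);
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    public static void main(String[] args) {
        System.out.println(decode(encode(-1.0f)));                        // negative boost is lost
        System.out.println(decode(encode(1.0f)));                         // 1.0 round-trips exactly
        System.out.println(decode(encode((float) (1.0 / Math.sqrt(3))))); // rounded down
    }
}
```

Under these assumptions a boost of -1 decodes to 0.0, and 1/sqrt(3) (about 0.577) rounds down to 0.5, which would be consistent with the fieldNorm of 0.5 reported for the three-term field in the explain output, assuming the default length norm of 1/sqrt(numTerms).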

-----------------------------------------------
BeanShell 2.0b4 - by Pat Niemeyer (pat@pat.net)
bsh % import org.apache.lucene.search.*;
bsh % import org.apache.lucene.index.*;
bsh % import org.apache.lucene.store.*;
bsh % import org.apache.lucene.document.*;
bsh % import org.apache.lucene.analysis.*;
bsh % tq = new TermQuery(new Term("a", "b"));
bsh % print(tq);
a:b
bsh % tq.setBoost(-1);
bsh % print(tq);
a:b^-1.0
bsh % q = new BooleanQuery();
bsh % tq1 = new TermQuery(new Term("a", "c"));
bsh % tq1.setBoost(10);
bsh % q.add(tq1, BooleanClause.Occur.SHOULD);
bsh % q.add(tq, BooleanClause.Occur.SHOULD);
bsh % print(q);
a:c^10.0 a:b^-1.0
bsh % dir = new RAMDirectory();
bsh % w = new IndexWriter(dir, new WhitespaceAnalyzer());
bsh % doc = new Document();
bsh % doc.add(new Field("a", "b c d", Field.Store.YES, Field.Index.ANALYZED));
bsh % w.addDocument(doc);
bsh % w.close();
bsh % r = IndexReader.open(dir);
bsh % is = new IndexSearcher(r);
bsh % td = is.search(q, 10);
bsh % sd = td.scoreDocs;
bsh % print(sd.length);
1
bsh % print(is.explain(q, 0));
0.1373985 = (MATCH) sum of:
   0.15266499 = (MATCH) weight(a:c^10.0 in 0), product of:
     0.99503726 = queryWeight(a:c^10.0), product of:
       10.0 = boost
       0.30685282 = idf(docFreq=1, numDocs=1)
       0.32427183 = queryNorm
     0.15342641 = (MATCH) fieldWeight(a:c in 0), product of:
       1.0 = tf(termFreq(a:c)=1)
       0.30685282 = idf(docFreq=1, numDocs=1)
       0.5 = fieldNorm(field=a, doc=0)
   -0.0152664995 = (MATCH) weight(a:b^-1.0 in 0), product of:
     -0.099503726 = queryWeight(a:b^-1.0), product of:
       -1.0 = boost
       0.30685282 = idf(docFreq=1, numDocs=1)
       0.32427183 = queryNorm
     0.15342641 = (MATCH) fieldWeight(a:b in 0), product of:
       1.0 = tf(termFreq(a:b)=1)
       0.30685282 = idf(docFreq=1, numDocs=1)
       0.5 = fieldNorm(field=a, doc=0)

bsh %
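
The arithmetic in the explain output above can be reproduced by hand. The sketch below is a stand-alone recomputation, not Lucene code: the class and method names are hypothetical, and it assumes the classic formulas idf = 1 + ln(numDocs / (docFreq + 1)), tf = sqrt(termFreq), and queryNorm = 1 / sqrt(sum of squared clause weights). The fieldNorm of 0.5 is taken directly from the output.

```java
// Hypothetical recomputation of the explain() numbers; plain Java, no Lucene.
class NegativeBoostScore {
    // Assumed classic idf: 1 + ln(numDocs / (docFreq + 1)).
    static double idf(int docFreq, int numDocs) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    // One clause's contribution: queryWeight * fieldWeight.
    static double clause(double boost, double idf, double queryNorm,
                         double tf, double fieldNorm) {
        double queryWeight = boost * idf * queryNorm;
        double fieldWeight = tf * idf * fieldNorm;
        return queryWeight * fieldWeight;
    }

    // Score of "a:c^10.0 a:b^-1.0" against the single document "b c d".
    static double score() {
        double idf = idf(1, 1);            // both terms: docFreq=1, numDocs=1
        double w1 = 10.0 * idf;            // weight of a:c^10.0 before normalization
        double w2 = -1.0 * idf;            // weight of a:b^-1.0 before normalization
        // Weights are squared, so the negative boost still enlarges the norm.
        double queryNorm = 1.0 / Math.sqrt(w1 * w1 + w2 * w2);
        double tf = Math.sqrt(1);          // termFreq = 1 for each term
        double fieldNorm = 0.5;            // value reported by explain()
        return clause(10.0, idf, queryNorm, tf, fieldNorm)
             + clause(-1.0, idf, queryNorm, tf, fieldNorm);
    }

    public static void main(String[] args) {
        System.out.println(score());       // close to the 0.1373985 reported above
    }
}
```

The negative clause subtracts exactly one tenth of the positive clause's contribution here, because both terms share the same idf, tf, and fieldNorm and differ only in boost.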


-- 
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com

