lucene-solr-user mailing list archives

From Ahmet Arslan <iori...@yahoo.com>
Subject Re: Function query to boost scores by a constant if all terms are present
Date Tue, 17 Aug 2010 20:48:16 GMT
> Most of the time, items that match all three terms will float to the
> top by normal ranking, but sometimes there are only two terms that are
> like a rash across the record, and they end up with a higher score
> than some items that match all three query terms.
>
> I'd like to boost items with all the query terms to the top *without
> changing their order*.
>
> My first thought was to use a simple boost query
> allfields:(a AND b AND c), but the order of the set of records that
> contain all three terms changes when I do that. What I *think* I need
> to do is basically to say, "Hey, all the items with all three terms
> get an extra 40,000 points, but change nothing else".

This is a hard task, and I am not sure it is possible. In any case, you would need to change
the similarity algorithm for that, since the final score is composed of many factors: coord,
norm, tf-idf, ...

http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/Similarity.html

Maybe you can try customizing coord(q,d). But there can always be cases like the ones you
describe. For example, a very long document containing all three terms will be penalized for
its length, and a very short document with only two of the query terms can pop up ahead of it.
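
To make the length effect concrete, here is a rough sketch in Python. It is not Lucene code. It assumes a stripped-down version of the classic scoring formula, coord * sum(tf) * lengthNorm, with tf = sqrt(freq) and lengthNorm = 1/sqrt(number of terms in the field), and it leaves out idf, queryNorm and boosts entirely. The function name, the use_norms flag and the document lengths are made up for illustration.

```python
import math

def simplified_score(term_freqs, doc_length, query_terms, use_norms=True):
    """Toy score: coord * sum(sqrt(freq)) * 1/sqrt(doc_length).

    idf, queryNorm and boosts are ignored (assumed equal for every term),
    so the numbers only show the direction of the effect.
    """
    matched = [t for t in query_terms if term_freqs.get(t, 0) > 0]
    coord = len(matched) / len(query_terms)           # fraction of query terms matched
    tf_sum = sum(math.sqrt(term_freqs[t]) for t in matched)
    length_norm = 1.0 / math.sqrt(doc_length) if use_norms else 1.0
    return coord * tf_sum * length_norm

query = ["a", "b", "c"]

# Hypothetical long record: all three terms once, 400 terms in the field.
long_doc = simplified_score({"a": 1, "b": 1, "c": 1}, 400, query)

# Hypothetical short record: only two of the terms, 16 terms in the field.
short_doc = simplified_score({"a": 1, "b": 1}, 16, query)

print(long_doc, short_doc)
```

Under these assumptions the short two-term record scores higher than the long three-term one. Calling the same function with use_norms=False reverses the order, because coord and the number of matched terms are then the only things left.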

It is easy to rank items with all three terms so that they come first (omitNorms="true"
and omitTermFreqAndPositions="true" should almost do it), but the "change nothing else" part
is not.

The easiest thing may be to send an additional query with a pure AND operator and display
those results in a special way.
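
One way to read that suggestion, sketched in Python on the client side: run the normal query, run the pure AND query, then move the AND matches to the front while keeping the normal query's order inside each group. The function name and the sample ids and scores are hypothetical, and the sketch assumes both result lists have already been fetched in full (it says nothing about paging).

```python
def all_terms_first(normal_results, and_ids):
    """Put documents that matched the pure-AND query first.

    normal_results: list of (doc_id, score), already sorted by score descending.
    and_ids: ids returned by the additional pure-AND query.
    Relative order within each group is left exactly as the normal query gave it.
    """
    and_ids = set(and_ids)
    with_all_terms = [r for r in normal_results if r[0] in and_ids]
    the_rest = [r for r in normal_results if r[0] not in and_ids]
    return with_all_terms + the_rest

# Hypothetical results of the normal query and of the AND query.
normal_results = [("d2", 9.1), ("d7", 8.4), ("d1", 7.9), ("d5", 3.2)]
and_ids = ["d5", "d7"]

reordered = all_terms_first(normal_results, and_ids)
print(reordered)
# [('d7', 8.4), ('d5', 3.2), ('d2', 9.1), ('d1', 7.9)]
```

The scores themselves are untouched, so d7 still ranks above d5 just as it did in the normal query; only the grouping changes.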


      
