lucene-solr-user mailing list archives

From Jan Høydahl / Cominvent <jan....@cominvent.com>
Subject Re: Function query to boost scores by a constant if all terms are present
Date Wed, 18 Aug 2010 10:32:54 GMT
You can use the map() function for this, see http://wiki.apache.org/solr/FunctionQuery#map

q=a fox&defType=dismax&qf=allfields&bf=map(query($qq),0,0,0,100.0)&qq=allfields:(quick AND brown AND fence)

map() returns 0 when the score of the $qq query is exactly 0, and 100.0 otherwise. So this
adds a constant boost of 100.0 whenever $qq returns a non-zero score, which it does
whenever all three terms match.
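
To illustrate why a constant boost keeps the relative order of the all-terms matches, here is a rough Python sketch. It only emulates the map() arithmetic; it is not Solr code. The names solr_map and boosted_score and the sample scores are made up for illustration, and it assumes the bf value is simply added to the base score, ignoring query normalization and other scoring factors.

```python
def solr_map(x, lo, hi, target, default=None):
    """Emulate map(x,min,max,target[,default]) from the FunctionQuery wiki:
    values in [lo, hi] become target; other values become default,
    or stay unchanged when no default is given."""
    if lo <= x <= hi:
        return target
    return x if default is None else default


def boosted_score(base_score, qq_score, boost=100.0):
    # Assumption: the bf value is added to the base relevance score.
    return base_score + solr_map(qq_score, 0, 0, 0, boost)


# Hypothetical (base score, $qq score) pairs; $qq scores 0 unless all terms match.
docs = {
    "two_terms_repeated": (3.2, 0.0),
    "all_terms_long_doc": (1.1, 0.4),
    "all_terms_short_doc": (2.5, 0.9),
}

ranked = sorted(docs, key=lambda d: boosted_score(*docs[d]), reverse=True)
print(ranked)
```

Both all-terms documents receive the same +100.0, so they keep their original order relative to each other (2.5 before 1.1) and both move ahead of the two-term document.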

PS: You can achieve the same in a Lucene query, using q=a fox _val_:"map(query($qq),0,0,0,100.0)"
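
The requests above contain spaces, quotes, parentheses and a $ sign, so they need URL encoding before being sent over HTTP. A minimal sketch using Python's standard library follows; the host, port and /select path in BASE_URL are assumptions about a local default install, not something from this thread.

```python
from urllib.parse import parse_qs, urlencode

# Assumed endpoint of a local Solr instance; adjust to your setup.
BASE_URL = "http://localhost:8983/solr/select"

# The dismax variant: the boost function goes into bf.
dismax_params = {
    "q": "a fox",
    "defType": "dismax",
    "qf": "allfields",
    "bf": "map(query($qq),0,0,0,100.0)",
    "qq": "allfields:(quick AND brown AND fence)",
}

# The Lucene query parser variant: the function is embedded via _val_.
lucene_params = {
    "q": 'a fox _val_:"map(query($qq),0,0,0,100.0)"',
    "qq": "allfields:(quick AND brown AND fence)",
}

dismax_url = BASE_URL + "?" + urlencode(dismax_params)
lucene_url = BASE_URL + "?" + urlencode(lucene_params)

# Round-trip check: decoding the query string gives back the original values.
decoded = parse_qs(urlencode(dismax_params))

print(dismax_url)
print(lucene_url)
```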

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Training in Europe - www.solrtraining.com

On 17. aug. 2010, at 22.48, Ahmet Arslan wrote:

>> Most of the time, items that match all three terms will float to the top by
>> normal ranking, but sometimes there are only two terms that are like a rash
>> across the record, and they end up with a higher score than some items that
>> match all three query terms.
>>
>> I'd like to boost items with all the query terms to the top *without
>> changing their order*.
>>
>> My first thought was to use a simple boost query allfields:(a AND b AND c),
>> but the order of the set of records that contain all three terms changes
>> when I do that. What I *think* I need to do is basically to say, "Hey, all
>> the items with all three terms get an extra 40,000 points, but change
>> nothing else".
> 
> This is a hard task, and I am not sure it is possible. But you would need to change the
> similarity algorithm for that. The final score is composed of many factors: coord, norm, tf-idf ...
>
> http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/Similarity.html
>
> Maybe you can try to customize coord(q,d). But there can always be cases like the ones you
> describe. For example, a very long document containing all three terms will be penalized for
> its length, and a very short document with two query terms can pop up before it.
>
> It is easy to rank items with all three terms so that they come first (omitNorms="true"
> and omitTermFreqAndPositions="true" should almost do it), but the "change nothing else" part
> is not.
>
> The easiest thing may be to run an additional query with a pure AND operator and display
> those results in a special way.