lucene-java-user mailing list archives

From Paul Taylor <>
Subject Re: Anyway to not bother scoring less good matches ?
Date Wed, 04 May 2011 12:34:09 GMT
On 04/05/2011 12:51, Paul Taylor wrote:
> On 04/05/2011 12:39, Ahmet Arslan wrote:
>> I'm receiving a number of searches with many ORs so that the total
>> number of matches is huge (>  1 million) although only the first 20 
>> results are required. Analysis shows most time is spent scoring the 
>> results. Now it seems to me that if you are sending a query with 10 OR
>> components, documents that match most of the terms are bound to get a 
>> better score than a match that only matches one or two of the terms.  
>> So does Lucene do any optimization to not bother working out the
>> scores of the poor matches?
>> EDIT: Actually I am not sure about that statement, because if only one
>> term matches it could still get the highest score if the match was on
>> the shortest term.
>> But can you see my point: is there a way to get Lucene to discount the
>> less good matches without scoring them, or is there another approach?
>> At the moment we allow the full Lucene syntax and use QueryParser to
>> parse a query and pass the resultant query to search unchanged
>> (except for handling of numeric fields). Should I be modifying the
>> query somehow?
>> You can restrict the number of returned results by using an adaptively
>> computed BooleanQuery.setMinimumNumberShouldMatch(int) parameter.
>> For example, if you have 10 optional clauses you can set minimum
>> should match to 60% of 10 = 6.
>> A similar mechanism exists in Solr:

> Thanks for the hint, so this could be done by overriding 
> getBooleanQuery() in QueryParser ?
> Paul
Well, I did extend QueryParser, and the method is being called, but rather
disappointingly it had no noticeable effect on how long queries took. I
really thought that by reducing the number of matches the corresponding
scoring phase would be quicker.

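     protected Query getBooleanQuery(List<BooleanClause> clauses, boolean disableCoord)
         throws ParseException
     {
         BooleanQuery query = (BooleanQuery) super.getBooleanQuery(clauses, disableCoord);
         if(clauses.size() > 5)
         {
             // Assumed value: 60% of the clauses, as suggested in the quoted reply
             query.setMinimumNumberShouldMatch((int) (clauses.size() * 0.6));
         }
         return query;
     }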
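
For reference, the "adaptively computed" minimum from the quoted suggestion can be sketched as a small standalone helper. This is only an illustration: the class name MinShouldMatch, the percent argument and the threshold argument are hypothetical and not part of Lucene. The returned value is what would be passed to BooleanQuery.setMinimumNumberShouldMatch(int), where 0 means no minimum is enforced.

```java
// Hypothetical helper, not part of Lucene: computes a minimum-should-match
// value from the number of optional clauses in a query.
final class MinShouldMatch {
    private MinShouldMatch() {}

    /**
     * Returns how many optional clauses a document must match.
     * Queries with at most `threshold` clauses get 0 (no minimum).
     */
    static int compute(int optionalClauses, int percent, int threshold) {
        if (optionalClauses < 0 || percent < 0 || percent > 100) {
            throw new IllegalArgumentException("invalid clause count or percentage");
        }
        if (optionalClauses <= threshold) {
            return 0;
        }
        // Integer arithmetic rounds down, so 60% of 7 clauses gives 4.
        return optionalClauses * percent / 100;
    }
}
```

With 10 optional clauses, 60 percent and a threshold of 5, compute(10, 60, 5) gives the 6 from the example above; a query with 5 or fewer clauses is left unrestricted.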
