lucene-dev mailing list archives

From "Peter Keegan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-5831) Scale score PostFilter
Date Tue, 11 Mar 2014 13:09:45 GMT

    [ https://issues.apache.org/jira/browse/SOLR-5831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13930313#comment-13930313 ]

Peter Keegan commented on SOLR-5831:
------------------------------------

> How is this performing compared to using the scale() function?
There is no comparison: the PostFilter is far faster. I'm running Solr on a 4-vCPU EC2
instance and tested with SolrMeter, using a production index (1.6 million docs) and
production queries at a leisurely rate of 10 QPS:

1. scale() with function query:
Median response time: 3000 ms
Ave response time: 8000 ms
Load average: double digits 

2. PostFilter with maxscalehits=0 (rows=50):
Median response time: 18 ms
Ave response time: 108 ms
Load average:  <1

3. PostFilter with maxscalehits=10000:
Median response time: 21 ms
Ave response time: 120 ms
Load average: <1 

4. PostFilter with maxscalehits=-1 (scale all hits):
Worse than #1. Most queries timed out.
This is not surprising, since the PriorityQueue is often huge because of high hit counts,
and all hits are delegated.
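
To illustrate why capping the number of collected hits matters, here is a minimal sketch of a bounded top-N collector. It is written in Python for brevity and is not the patch's Java code; the name top_n_scores is hypothetical. With a cap of n, the heap never holds more than n entries, whereas scaling all hits means the queue grows with the hit count.

```python
import heapq


def top_n_scores(scores, n):
    """Return the n largest scores, highest first, using a bounded min-heap.

    Illustrative only: a stand-in for a size-limited priority queue,
    not the data structure used by the SOLR-5831 patch.
    """
    if n <= 0:
        return []
    heap = []  # min-heap; heap[0] is the smallest score currently kept
    for s in scores:
        if len(heap) < n:
            heapq.heappush(heap, s)
        elif s > heap[0]:
            # Evict the current smallest and keep the new, larger score.
            heapq.heapreplace(heap, s)
    return sorted(heap, reverse=True)
```

Each hit costs at most O(log n) here, so a small cap keeps both memory and per-hit work low regardless of the total hit count.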

Regarding the QueryResultCache, are there any suggestions on how to determine its size in
the context of the PostFilter?

Thanks,
Peter


> Scale score PostFilter
> ----------------------
>
>                 Key: SOLR-5831
>                 URL: https://issues.apache.org/jira/browse/SOLR-5831
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>    Affects Versions: 4.7
>            Reporter: Peter Keegan
>            Priority: Minor
>         Attachments: SOLR-5831.patch
>
>
> The ScaleScoreQParserPlugin is a PostFilter that performs score scaling.
> This is an alternative to using a function query wrapping a scale() wrapping a query(). For example:
> select?qq={!edismax v='news' qf='title^2 body'}&scaledQ=scale(product(query($qq),1),0,1)&q={!func}sum(product(0.75,$scaledQ),product(0.25,field(myfield)))&fq={!query v=$qq}
> The problem with this query is that it has to scale every hit. Usually, only the returned hits need to be scaled,
> but there may be use cases where the number of hits to be scaled is greater than the returned hit count,
> but less than or equal to the total hit count.
> Sample syntax:
> fq={!scalescore+l=0.0 u=1.0 maxscalehits=10000 func=sum(product(sscore(),0.75),product(field(myfield),0.25))}
> l=0.0 u=1.0          // Scale scores to values between 0 and 1, inclusive
> maxscalehits=10000   // The maximum number of result scores to scale (-1 = all hits, 0 = results 'page' size)
> func=...             // Apply the composite function to each hit. The scaled score value is accessed by the 'score()' value source
> All parameters are optional. The defaults are:
> l=0.0 u=1.0
> maxscalehits=0 (result window size)
> func=(null)
>  
> Note: this patch is not complete, as it contains no test cases and may not conform 
> to all the guidelines in http://wiki.apache.org/solr/HowToContribute. 
>  
> I would appreciate any feedback on the usability and implementation.
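
The scaling and composite function described above can be sketched as follows. This is an illustrative Python sketch, not code from the patch: scale_scores and composite are hypothetical names, the 0.75/0.25 weights are taken from the sample syntax, and mapping every score to the lower bound when all scores are equal is an assumption (the issue does not say how that case is handled).

```python
def scale_scores(scores, lower=0.0, upper=1.0):
    """Min-max scale scores into [lower, upper], like l=0.0 u=1.0."""
    if not scores:
        return []
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # Assumption: with no score range, map everything to the lower bound.
        return [lower for _ in scores]
    span = hi - lo
    return [lower + (s - lo) * (upper - lower) / span for s in scores]


def composite(scaled_score, field_value, w_score=0.75, w_field=0.25):
    """Weighted sum of the scaled score and a field value, mirroring
    sum(product(<scaled score>,0.75),product(field(myfield),0.25))."""
    return w_score * scaled_score + w_field * field_value
```

For example, raw scores of 2.0, 4.0 and 6.0 scale to 0.0, 0.5 and 1.0, and a hit with scaled score 0.5 and a field value of 1.0 gets a composite score of 0.625.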



--
This message was sent by Atlassian JIRA
(v6.2#6252)


