lucene-dev mailing list archives

From "Joel Bernstein (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-5831) Scale score PostFilter
Date Sun, 09 Mar 2014 16:05:43 GMT

    [ https://issues.apache.org/jira/browse/SOLR-5831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925227#comment-13925227 ]

Joel Bernstein commented on SOLR-5831:
--------------------------------------

Peter,

I was able to do a first review of the code before heading out on vacation.

Very cool piece of code. How is this performing compared to using the scale() function?

The following issues were in early versions of the CollapsingQParserPlugin, so you can look
at the most recent version to see how they were resolved:

1) The ScoreScaleFilter class needs to only have instance variables that are needed for the
hashCode() and equals() methods; otherwise there'll be all kinds of bugs with the Solr caches.
So any work you're doing in the constructor of this class and hanging onto needs to be moved
to the getFilterCollector() method.

2) The DummyScore also needs to implement the docID() method. Pretty simple to do, check the
latest CollapsingQParserPlugin to see how this is handled.

3) I think getting this working with the QueryResultCache will be important. Early versions
of the CollapsingQParserPlugin didn't do this, but standard grouping didn't either, so it
wasn't a downgrade in functionality for FieldCollapsing. But people who use this feature will
be surprised if the QueryResultCache stops working. So hashCode() and equals() will need to
be implemented.

4) The value source needs a proper context (rcontext in the code). Latest version of the CollapsingQParserPlugin
demonstrates this as well.
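
A minimal sketch of the pattern behind points 1 and 3, written in plain Java with no Solr dependencies. The names ScaleFilterSketch, ScoreRange and newCollector() are hypothetical and are not taken from the patch or from the CollapsingQParserPlugin; newCollector() only stands in for the role getFilterCollector() plays. The idea shown is that the filter keeps nothing but its parsed parameters as fields, so equals() and hashCode() describe it completely when it is used as a cache key, while per-request state is created on demand:

```java
import java.util.Objects;

// Hypothetical stand-in for the filter class; it does not use the Solr API.
final class ScaleFilterSketch {
    // Only the parsed parameters are stored, so two filters built from the
    // same request parameters compare equal and hash identically.
    private final float lower;
    private final float upper;
    private final int maxScaleHits;
    private final String func; // may be null when no function is given

    ScaleFilterSketch(float lower, float upper, int maxScaleHits, String func) {
        this.lower = lower;
        this.upper = upper;
        this.maxScaleHits = maxScaleHits;
        this.func = func;
    }

    // Plays the role of getFilterCollector(): per-request state is built here
    // and handed to the caller, never stored on the filter itself.
    ScoreRange newCollector() {
        return new ScoreRange();
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ScaleFilterSketch)) return false;
        ScaleFilterSketch that = (ScaleFilterSketch) o;
        return Float.compare(lower, that.lower) == 0
            && Float.compare(upper, that.upper) == 0
            && maxScaleHits == that.maxScaleHits
            && Objects.equals(func, that.func);
    }

    @Override
    public int hashCode() {
        return Objects.hash(lower, upper, maxScaleHits, func);
    }

    // Mutable per-request state: the score range seen so far.
    static final class ScoreRange {
        float min = Float.POSITIVE_INFINITY;
        float max = Float.NEGATIVE_INFINITY;

        void collect(float score) {
            min = Math.min(min, score);
            max = Math.max(max, score);
        }
    }
}
```

Because the score range lives in the object returned by newCollector(), collecting hits for one request cannot change the identity of a filter instance that is also serving as a cache key.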

Also having good tests will be important and probably somewhat tricky to write.  Using some
form of randomized testing would be good to ensure that random scores get normalized properly.
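
As a rough illustration of the kind of randomized check meant here, the following plain-Java sketch scales random scores with ordinary min-max normalization and verifies the result. ScaleCheck, scale() and checkRandom() are hypothetical names, the normalization is a generic stand-in rather than the code in the patch, and mapping all-equal scores to the lower bound is an assumption made only for this example:

```java
import java.util.Random;

// Hypothetical helper; a generic min-max scaler, not the patch's code.
final class ScaleCheck {
    // Maps the observed [min, max] of the scores linearly onto [lower, upper].
    static float[] scale(float[] scores, float lower, float upper) {
        float min = Float.POSITIVE_INFINITY;
        float max = Float.NEGATIVE_INFINITY;
        for (float s : scores) {
            min = Math.min(min, s);
            max = Math.max(max, s);
        }
        float range = max - min;
        float[] out = new float[scores.length];
        for (int i = 0; i < scores.length; i++) {
            // Assumption: identical scores have no range, so use the lower bound.
            out[i] = range == 0f
                ? lower
                : lower + (scores[i] - min) / range * (upper - lower);
        }
        return out;
    }

    // Scales random score arrays to [0, 1] and throws if a value falls
    // outside the range or if the extremes do not land on the bounds.
    static void checkRandom(long seed, int iterations) {
        Random rnd = new Random(seed);
        for (int iter = 0; iter < iterations; iter++) {
            float[] scores = new float[1 + rnd.nextInt(1000)];
            for (int i = 0; i < scores.length; i++) {
                scores[i] = rnd.nextFloat() * 100f;
            }
            float[] scaled = scale(scores, 0f, 1f);
            float lo = Float.POSITIVE_INFINITY;
            float hi = Float.NEGATIVE_INFINITY;
            for (float v : scaled) {
                if (v < 0f || v > 1f) {
                    throw new AssertionError("out of range: " + v + " (seed " + seed + ")");
                }
                lo = Math.min(lo, v);
                hi = Math.max(hi, v);
            }
            // With a single hit, or all-equal scores, only the lower bound is reached.
            if (lo != 0f || (hi != 1f && hi != 0f)) {
                throw new AssertionError("extremes not on bounds (seed " + seed + ")");
            }
        }
    }
}
```

Reporting the seed in the failure message keeps a failing run reproducible, which is the main practical requirement for tests of this kind.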

I'll check in on this when I get back from vacation.

Joel

  




> Scale score PostFilter
> ----------------------
>
>                 Key: SOLR-5831
>                 URL: https://issues.apache.org/jira/browse/SOLR-5831
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>    Affects Versions: 4.7
>            Reporter: Peter Keegan
>            Priority: Minor
>         Attachments: SOLR-5831.patch
>
>
> The ScaleScoreQParserPlugin is a PostFilter that performs score scaling.
> This is an alternative to using a function query wrapping a scale() wrapping a query(). For example:
> select?qq={!edismax v='news' qf='title^2 body'}&scaledQ=scale(product(query($qq),1),0,1)&q={!func}sum(product(0.75,$scaledQ),product(0.25,field(myfield)))&fq={!query v=$qq}
> The problem with this query is that it has to scale every hit. Usually, only the returned hits need to be scaled,
> but there may be use cases where the number of hits to be scaled is greater than the returned hit count,
> but less than or equal to the total hit count.
> Sample syntax:
> fq={!scalescore+l=0.0 u=1.0 maxscalehits=10000 func=sum(product(sscore(),0.75),product(field(myfield),0.25))}
> l=0.0 u=1.0 		//Scale scores to values between 0-1, inclusive
> maxscalehits=10000 	//The maximum number of result scores to scale (-1 = all hits, 0 = results 'page' size)
> func=... 			//Apply the composite function to each hit. The scaled score value is accessed by the 'score()' value source
> All parameters are optional. The defaults are:
> l=0.0 u=1.0
> maxscalehits=0 (result window size)
> func=(null)
>  
> Note: this patch is not complete, as it contains no test cases and may not conform 
> to all the guidelines in http://wiki.apache.org/solr/HowToContribute. 
>  
> I would appreciate any feedback on the usability and implementation.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

