lucene-dev mailing list archives

From Earwin Burrfoot <>
Subject Re: Proposal: Scorer api change
Date Wed, 09 Jun 2010 12:16:18 GMT
What I have in mind is basically having two parallel trees - one for
matching, one for scoring.
The matching tree is completely independent and can be used on its own,
for example as a filter with a sort-by-field approach.
Scoring tree nodes hold references to the corresponding matching tree
nodes, so they can exploit their "current state".
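
To make the two-tree idea concrete, here is a minimal Java sketch. It is
only an illustration under assumptions: every name in it (Matcher,
TermMatcher, OrMatcher, SumScorer and so on) is hypothetical and none of it
is existing Lucene API. Matchers iterate doc ids and know nothing about
scores; each scorer node holds a reference to its matcher node and reads
that matcher's current position.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// All names are hypothetical; nothing here is an existing Lucene class.
final class ParallelTreesSketch {

    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Matching tree node: iterates matching docs, knows nothing about scores. */
    interface Matcher {
        int nextDoc(); // advance and return the new doc id, or NO_MORE_DOCS
        int docId();   // current doc id, -1 before the first nextDoc()
    }

    /** Leaf matcher over sorted doc ids with a term frequency per doc. */
    static final class TermMatcher implements Matcher {
        private final int[] docs;
        private final int[] freqs;
        private int pos = -1;

        TermMatcher(int[] docs, int[] freqs) { this.docs = docs; this.freqs = freqs; }

        public int nextDoc() { if (pos < docs.length) pos++; return docId(); }
        public int docId() {
            if (pos < 0) return -1;
            return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
        }
        int freq() { return freqs[pos]; } // only valid while positioned on a doc
    }

    /** Disjunction: positioned on the smallest doc id among its children. */
    static final class OrMatcher implements Matcher {
        final List<Matcher> children;
        private int current = -1;

        OrMatcher(List<Matcher> children) { this.children = children; }

        public int nextDoc() {
            int min = NO_MORE_DOCS;
            for (Matcher m : children) {
                // Advance only the children sitting on the doc we are leaving.
                int d = m.docId() == current ? m.nextDoc() : m.docId();
                min = Math.min(min, d);
            }
            current = min;
            return current;
        }
        public int docId() { return current; }
    }

    /** Scoring tree node: computes a score for the matcher's current doc. */
    interface Scorer { float score(); }

    /** Toy leaf scorer: the score is simply the term frequency. */
    static final class TermFreqScorer implements Scorer {
        private final TermMatcher matcher;
        TermFreqScorer(TermMatcher matcher) { this.matcher = matcher; }
        public float score() { return matcher.freq(); }
    }

    /** Sums the child scorers whose matcher sits on the current doc. */
    static final class SumScorer implements Scorer {
        private final OrMatcher matcher;
        private final List<Scorer> children; // parallel to matcher.children

        SumScorer(OrMatcher matcher, List<Scorer> children) {
            this.matcher = matcher;
            this.children = children;
        }
        public float score() {
            float sum = 0;
            for (int i = 0; i < children.size(); i++) {
                if (matcher.children.get(i).docId() == matcher.docId()) {
                    sum += children.get(i).score();
                }
            }
            return sum;
        }
    }

    /** Matching only, e.g. for filtering: no scorer is ever created. */
    static int countMatches(Matcher m) {
        int n = 0;
        while (m.nextDoc() != NO_MORE_DOCS) n++;
        return n;
    }

    /** Matching plus scoring: returns "doc:score" for every hit. */
    static List<String> demo() {
        TermMatcher a = new TermMatcher(new int[] {1, 3, 5}, new int[] {2, 1, 4});
        TermMatcher b = new TermMatcher(new int[] {3, 4}, new int[] {7, 1});
        OrMatcher or = new OrMatcher(Arrays.<Matcher>asList(a, b));
        Scorer scorer = new SumScorer(or,
                Arrays.<Scorer>asList(new TermFreqScorer(a), new TermFreqScorer(b)));
        List<String> hits = new ArrayList<>();
        while (or.nextDoc() != NO_MORE_DOCS) hits.add(or.docId() + ":" + scorer.score());
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Note that countMatches never touches a scorer, which is the point of keeping
the matching tree independent.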

Both trees are built by a visitor over an AST that is either produced from
a textual query or constructed programmatically.
So all you have to do is write said visitors. Some of the basic
scorers can be reused by your custom visitor, so voila - we have nice
extensibility by composition, instead of extensibility by inheritance
(which sucks). Also, all this custom code is gathered in a single
class, instead of being spread over your query derivatives.
This is not a final design; lots of things can differ. E.g., the trees
don't have to be strictly parallel. If we want some query branch to do
matching but not affect the score, we currently wrap it in
ConstantScoreQuery. In my design the matcher tree stays as is, but the
corresponding scorer tree branch is replaced by ConstantScore.
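
A rough Java sketch of the visitor side, again under assumptions: the AST
node types (Term, Or, NoScore), the Visitor interface, the Plan pair and the
toy in-memory index are all hypothetical and not part of Lucene. A single
visitor walks the AST and returns a matcher together with a scorer for each
node. For a NoScore branch the matcher is kept unchanged while the scorer is
swapped for a constant.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// All names are hypothetical; nothing here is an existing Lucene API.
final class VisitorSketch {

    // --- query AST, produced by a parser or built by hand ---
    interface Node { <R> R accept(Visitor<R> v); }

    static final class Term implements Node {
        final String text;
        Term(String text) { this.text = text; }
        public <R> R accept(Visitor<R> v) { return v.term(this); }
    }

    static final class Or implements Node {
        final List<Node> clauses;
        Or(Node... clauses) { this.clauses = Arrays.asList(clauses); }
        public <R> R accept(Visitor<R> v) { return v.or(this); }
    }

    /** A branch that must match but must not contribute a real score. */
    static final class NoScore implements Node {
        final Node inner;
        NoScore(Node inner) { this.inner = inner; }
        public <R> R accept(Visitor<R> v) { return v.noScore(this); }
    }

    interface Visitor<R> { R term(Term t); R or(Or o); R noScore(NoScore n); }

    // --- the two trees, simplified to per-doc lookups ---
    interface Matcher { boolean matches(int doc); }
    interface Scorer { float score(int doc); } // only called for matching docs

    static final class Plan {
        final Matcher matcher;
        final Scorer scorer;
        Plan(Matcher matcher, Scorer scorer) { this.matcher = matcher; this.scorer = scorer; }
    }

    /** All custom matching and scoring decisions live in this one class. */
    static final class PlanBuilder implements Visitor<Plan> {
        private final Map<String, Map<Integer, Integer>> index; // term -> doc -> freq
        private final float constant;

        PlanBuilder(Map<String, Map<Integer, Integer>> index, float constant) {
            this.index = index;
            this.constant = constant;
        }

        public Plan term(Term t) {
            Map<Integer, Integer> postings =
                    index.getOrDefault(t.text, new HashMap<Integer, Integer>());
            return new Plan(doc -> postings.containsKey(doc),
                            doc -> postings.get(doc).floatValue());
        }

        public Plan or(Or o) {
            List<Plan> kids = new ArrayList<>();
            for (Node n : o.clauses) kids.add(n.accept(this));
            Matcher m = doc -> {
                for (Plan k : kids) if (k.matcher.matches(doc)) return true;
                return false;
            };
            Scorer s = doc -> {
                float sum = 0;
                for (Plan k : kids) if (k.matcher.matches(doc)) sum += k.scorer.score(doc);
                return sum;
            };
            return new Plan(m, s);
        }

        public Plan noScore(NoScore n) {
            // Matcher branch is kept as is; scorer branch becomes a constant.
            // (The inner scorer is built and dropped here for brevity.)
            Plan inner = n.inner.accept(this);
            return new Plan(inner.matcher, doc -> constant);
        }
    }

    private static Map<Integer, Integer> postings(int... docFreqPairs) {
        Map<Integer, Integer> m = new HashMap<>();
        for (int i = 0; i < docFreqPairs.length; i += 2) m.put(docFreqPairs[i], docFreqPairs[i + 1]);
        return m;
    }

    /** Returns "doc:score" for every matching doc in a six-document toy index. */
    static List<String> demo() {
        Map<String, Map<Integer, Integer>> index = new HashMap<>();
        index.put("a", postings(1, 2, 3, 1)); // doc 1 freq 2, doc 3 freq 1
        index.put("b", postings(3, 7, 4, 1)); // doc 3 freq 7, doc 4 freq 1
        Node query = new Or(new Term("a"), new NoScore(new Term("b")));
        Plan plan = query.accept(new PlanBuilder(index, 1.0f));
        List<String> hits = new ArrayList<>();
        for (int doc = 0; doc < 6; doc++) {
            if (plan.matcher.matches(doc)) hits.add(doc + ":" + plan.scorer.score(doc));
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this toy run the "b" branch still decides which docs match (doc 4 is a
hit only because of it), but its frequencies never reach the score. A
different scoring scheme would mean writing another Visitor<Plan> that reuses
the same Matcher and Scorer pieces, with no Query subclassing involved.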

2010/6/9 Shai Erera <>:
> I don't feel comfortable with the statement "these visitors are then free to
> specialize on matchers or not ...". Let's think how this API will be used ..
> today, the user has two hooks - the QueryParser and Collector. Collector
> allows you to plug in your own and by extending QP you can return your own
> Query for different fragments.
> The Query is a full set though - Query + Weight + Scorer. Whether you extend
> an existing query and just override one of the methods is up to you, but
> still the Query is self contained.
> If we break the Query API down to a Matcher and Scorer, how will you provide
> your own Scorer? Collector is independent of the Query - it just collects
> the results. Will the Scorer be independent of Query too (and become an
> argument)? I don't think so, 'cause you want to know
> which Matcher you're up against in order to write a good Scorer. There's no
> point passing in a PhraseScorer if the query does not include any
> PhraseMatcher. So will you need to extend Query, to return your own custom
> Scorer, for certain fragments? Can't you do it today already (given the API
> is not final, is public/protected etc.)?
> Earwin - is that what you had in mind? If so, let's think first if the
> current API is not sufficient, given that we 'open' it for extension ...
> e.g., can someone achieve that by extending PhraseQuery, overriding
> createScorer and returning his own? Do we need more than that?
> I'm not saying we should refactor the API to Matcher + Scorer, just thinking
> about what we really need to do and what's the best way to achieve it.
> Shai
> On Wed, Jun 9, 2010 at 2:24 PM, Earwin Burrfoot <> wrote:
>> > Can we represent the Query
>> > state in some general structure, that no matter which Query you get,
>> > you'll
>> > know how to score it?
>> No. You could go for a unified interface that allows you to express
>> different query states, like a set of untyped key-values, but you'll
>> end up switching on these key-values in the end.
>> It's better to define a set of matchers, and then produce visitors
>> that compute scores. These visitors are then free to specialize on
>> matchers or not, or ignore the whole tree completely.
>> --
>> Kirill Zakharenko/Кирилл Захаренко (
>> Phone: +7 (495) 683-567-4
>> ICQ: 104465785

Kirill Zakharenko/Кирилл Захаренко (
Phone: +7 (495) 683-567-4
ICQ: 104465785
