lucene-dev mailing list archives

From "Uwe Schindler (JIRA)" <j...@apache.org>
Subject [jira] Updated: (LUCENE-2838) ConstantScoreQuery should directly support wrapping Query and simply strip off scores
Date Thu, 30 Dec 2010 19:37:46 GMT

     [ https://issues.apache.org/jira/browse/LUCENE-2838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-2838:
----------------------------------

    Attachment: LUCENE-2838-no-topscorer-opt.patch

After thinking about it for a day, I found some problems with the "collector hack" and this
style of decorator pattern:
- If you wrap multiple times, the setScorer() method in the wrapped collector may set the
wrong scorer. You can see this if you wrap multiple ConstantScoreQueries on top of each other:
the boost of the inner one is returned. The problem is that the score(Collector) method
inverts the decorator pattern.
- The inner scorer (like BooleanScorer with its buckets) may set a scorer in the collector
other than itself, one that implements doc() differently, so always setting the ConstantScorer
as the collector's scorer can lead to wrong results. We don't see this in the tests, as no
collector there uses Scorer.doc(), only Scorer.score(), which returns the constant.
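
To make the second point concrete, here is a minimal sketch in plain Java. It does not use
the Lucene API: ToyScorer, ToyCollector, BucketedScorer and HackedConstantScorer are
hypothetical stand-ins, and BucketedScorer only imitates the idea of a scorer that hands a
separate helper scorer to the collector. It is not the code from the patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical miniature types; they only imitate the shape of Scorer/Collector.
abstract class ToyScorer {
    abstract int doc();
    abstract float score();
    // Top-level scoring: the scorer drives the collector itself.
    abstract void score(ToyCollector collector);
}

interface ToyCollector {
    void setScorer(ToyScorer scorer);
    void collect(int doc);
}

// Imitates a bucket-style scorer: it announces a helper scorer to the
// collector, and only that helper knows the current document.
class BucketedScorer extends ToyScorer {
    private int current = -1;
    private final ToyScorer helper = new ToyScorer() {
        int doc() { return current; }
        float score() { return 0.37f; }
        void score(ToyCollector c) { throw new UnsupportedOperationException(); }
    };

    int doc() { return -1; }          // this scorer tracks no position of its own
    float score() { return 0.37f; }

    void score(ToyCollector c) {
        c.setScorer(helper);          // NOT "this"
        for (int d = 0; d < 3; d++) {
            current = d;
            c.collect(d);
        }
    }
}

// The "collector hack": delegate score(Collector) to the inner scorer and
// replace whatever scorer it announces with this constant scorer.
class HackedConstantScorer extends ToyScorer {
    private final ToyScorer inner;
    private final float boost;

    HackedConstantScorer(ToyScorer inner, float boost) {
        this.inner = inner;
        this.boost = boost;
    }

    int doc() { return inner.doc(); }
    float score() { return boost; }

    void score(final ToyCollector c) {
        inner.score(new ToyCollector() {
            public void setScorer(ToyScorer ignored) { c.setScorer(HackedConstantScorer.this); }
            public void collect(int doc) { c.collect(doc); }
        });
    }
}

class CollectorHackDemo {
    // Records what scorer.doc() reports at each collect() call.
    static List<Integer> docsSeenViaScorer(ToyScorer top) {
        final List<Integer> seen = new ArrayList<>();
        top.score(new ToyCollector() {
            private ToyScorer scorer;
            public void setScorer(ToyScorer s) { scorer = s; }
            public void collect(int doc) { seen.add(scorer.doc()); }
        });
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(docsSeenViaScorer(new BucketedScorer()));
        System.out.println(docsSeenViaScorer(new HackedConstantScorer(new BucketedScorer(), 1.0f)));
    }
}
```

In this toy model, a collector that asks its scorer for doc() gets the right documents from
the bare BucketedScorer, but only -1 once the hack has swapped in the constant scorer. A
collector that only calls score() would not notice the difference.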

I changed the code so that CSQ now always passes topScorer=false to Weight.scorer() of the wrapped
query and no longer overrides the score(Collector,...) methods. It still allows out-of-order collection.
BooleanScorer2 is now always used with MTQs.
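
A rough sketch of that approach, again in plain Java with hypothetical miniature types
(MiniScorer, MiniCollector, MiniWeight, ArrayWeight and ConstantWeight are made up for
illustration and are not the Lucene classes or the patch itself): the constant-score wrapper
asks the wrapped weight for a non-top scorer, uses it only to iterate documents, and keeps
the inherited score(Collector) loop, so the collector always sees the outermost wrapper.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical miniature types; they only imitate the shape of Lucene's API.
abstract class MiniScorer {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    abstract int nextDoc();
    abstract int doc();
    abstract float score();

    // Default top-level loop: announce ourselves, then pull docs one by one.
    void score(MiniCollector c) {
        c.setScorer(this);
        for (int d = nextDoc(); d != NO_MORE_DOCS; d = nextDoc()) {
            c.collect(d);
        }
    }
}

interface MiniCollector {
    void setScorer(MiniScorer scorer);
    void collect(int doc);
}

interface MiniWeight {
    MiniScorer scorer(boolean topScorer);
}

// Leaf weight over a fixed, sorted list of matching docs.
class ArrayWeight implements MiniWeight {
    private final int[] docs;
    boolean requestedTopScorer = true; // remembers the flag of the last scorer() call

    ArrayWeight(int... docs) { this.docs = docs; }

    public MiniScorer scorer(boolean topScorer) {
        requestedTopScorer = topScorer;
        return new MiniScorer() {
            private int pos = -1;
            int nextDoc() { pos++; return doc(); }
            int doc() { return pos < 0 ? -1 : (pos < docs.length ? docs[pos] : NO_MORE_DOCS); }
            float score() { return 0.37f; }
        };
    }
}

// Constant-score wrapper: strips scores, never asks for a top scorer.
class ConstantWeight implements MiniWeight {
    private final MiniWeight inner;
    private final float boost;

    ConstantWeight(MiniWeight inner, float boost) {
        this.inner = inner;
        this.boost = boost;
    }

    public MiniScorer scorer(boolean topScorer) {
        final MiniScorer in = inner.scorer(false); // always topScorer=false for the wrapped query
        return new MiniScorer() {
            int nextDoc() { return in.nextDoc(); }
            int doc() { return in.doc(); }
            float score() { return boost; }
            // score(MiniCollector) is deliberately not overridden.
        };
    }
}

class NoTopScorerDemo {
    // Collects "doc:scorer.doc():scorer.score()" for two nested constant wrappers.
    static List<String> run(ArrayWeight leaf) {
        MiniWeight w = new ConstantWeight(new ConstantWeight(leaf, 2f), 5f);
        final List<String> out = new ArrayList<>();
        w.scorer(true).score(new MiniCollector() {
            private MiniScorer s;
            public void setScorer(MiniScorer scorer) { s = scorer; }
            public void collect(int doc) { out.add(doc + ":" + s.doc() + ":" + s.score()); }
        });
        return out;
    }

    public static void main(String[] args) {
        System.out.println(run(new ArrayWeight(1, 4, 7)));
    }
}
```

With two wrappers stacked, the collector in this sketch sees the outer boost and a doc()
that matches the collected document. What it gives up is any specialized top-level loop of
the wrapped query, which is the speed trade-off asked about below.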

The question is: would the previous (but broken) optimization be worth having for speed? Mike/Mark?

> ConstantScoreQuery should directly support wrapping Query and simply strip off scores
> -------------------------------------------------------------------------------------
>
>                 Key: LUCENE-2838
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2838
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: 3.1, 4.0
>
>         Attachments: LUCENE-2838-no-topscorer-opt.patch, LUCENE-2838.patch, LUCENE-2838.patch
>
>
> Especially in MultiTermQuery rewrite modes we often simply need to strip off scores from
> Queries and make them constant score. Currently the code to do this looks quite ugly:
> new ConstantScoreQuery(new QueryWrapperFilter(query))
> As the name says, ConstantScoreQuery should make any other Query constant score, so why
> does it not take a Query as ctor param? This question was also asked quite often by my customers
> and is simply correct, if you think about it.
> Looking closer into the code, it is clear that this would also speed up MTQs:
> - One additional level of wrapping and its method calls can be removed
> - Maybe we can even deprecate QueryWrapperFilter in 3.1 now (it's now only used in tests
> and the use-case for this class is not really available) and LUCENE-2831 does not need the
> stupid hack to make Simon's assertions pass
> - CSQ now supports out-of-order scoring and topLevel scoring, so a CSQ on top-level now
> directly feeds the Collector. For that a small trick is used: The score(Collector) calls are
> directly delegated and the scores are stripped by wrapping the setScorer() method in Collector.
> During that I found a visibility bug in Scorer (LUCENE-2839): The method "boolean score(Collector
> collector, int max, int firstDocID)" should be public, not protected, as it's not solely intended
> to be overridden by subclasses and is called from other classes, too! This causes no compiler
> errors, because the other classes that call it are mainly BooleanScorer(2), which are in the same
> package, but the visibility is wrong. I will open an issue for that and fix it at least in trunk,
> where we have no backwards-compatibility requirement.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.