lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-1427) QueryWrapperFilter should not do scoring
Date Tue, 28 Oct 2008 14:24:44 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12643221#action_12643221 ]

Michael McCandless commented on LUCENE-1427:
--------------------------------------------

bq. Perhaps we should have a OneTimeDocIdSet for cases like this one, and leave the possibility
to repeatedly generate a DocIdSetIterator to caching filters

I'm torn on this. It's nice in that it would "forcefully" remind you that if you are re-using
a filter you really should cache it. But there are legitimate cases where you don't
want to cache it (e.g. you know you will use it rarely, it's fast enough, and you don't want
to spend the RAM).
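
To make the trade-off concrete, here is a minimal sketch of what caching buys you and what it costs. Everything in it is hypothetical and for illustration only: CachingFilterSketch, its docIdSet method, and the generic "reader" key are made-up names, not Lucene classes, and java.util.BitSet stands in for a real DocIdSet. The cached bit sets are exactly the RAM you may not want to spend for a rarely used filter.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Function;

/**
 * Hypothetical caching wrapper (not a Lucene class): computes the bit set
 * once per reader and hands back the cached copy on every later call.
 */
final class CachingFilterSketch<R> {
    private final Function<R, BitSet> compute;               // the expensive, uncached filter
    private final Map<R, BitSet> cache = new WeakHashMap<>(); // entry can be dropped once the reader is unreachable
    int computations = 0;                                     // counts how often the filter really ran

    CachingFilterSketch(Function<R, BitSet> compute) {
        this.compute = compute;
    }

    synchronized BitSet docIdSet(R reader) {
        BitSet bits = cache.get(reader);
        if (bits == null) {
            bits = compute.apply(reader);   // pay the CPU cost once per reader
            computations++;
            cache.put(reader, bits);        // pay the RAM cost for as long as the reader lives
        }
        return bits;
    }

    public static void main(String[] args) {
        Object reader = new Object();       // stand-in for an index reader
        CachingFilterSketch<Object> filter = new CachingFilterSketch<>(r -> {
            BitSet s = new BitSet();
            s.set(3);
            return s;
        });
        filter.docIdSet(reader);
        filter.docIdSet(reader);
        System.out.println("computations=" + filter.computations);
    }
}
```

With the uncached filter, each call would recompute the bit set; that is fine when calls are rare, which is the case described above.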

Also, in this case it would break backward compatibility, since you can currently re-use
a QueryWrapperFilter instance.

So I guess I'm leaning back towards my original patch, which still allows re-use, but does
not waste CPU computing scores which are just discarded.
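
As a rough, self-contained illustration of that idea (not the patch itself), the sketch below walks the matching docs and sets bits without ever asking for a score. The MatchIterator interface and ArrayMatchIterator class are hypothetical stand-ins for a scorer, and java.util.BitSet stands in for OpenBitSet; the counter only exists to show that score() is never invoked.

```java
import java.util.BitSet;

/** Hypothetical stand-in for a scorer; this is not the Lucene API. */
interface MatchIterator {
    boolean next();   // advance to the next matching doc; false when exhausted
    int doc();        // current doc ID, valid only after next() returned true
    float score();    // potentially expensive; the filter below never calls it
}

/** Toy iterator over a sorted array of doc IDs that counts score() calls. */
final class ArrayMatchIterator implements MatchIterator {
    private final int[] docs;
    private int pos = -1;
    int scoreCalls = 0;

    ArrayMatchIterator(int[] docs) {
        this.docs = docs;
    }

    public boolean next() {
        return ++pos < docs.length;
    }

    public int doc() {
        return docs[pos];
    }

    public float score() {
        scoreCalls++;
        return 1.0f;
    }
}

final class NoScoreFilterSketch {
    /** Collects matching doc IDs into a bit set without computing any score. */
    static BitSet collectDocIds(MatchIterator it, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        while (it.next()) {
            bits.set(it.doc());   // record the hit; score() is deliberately not called
        }
        return bits;
    }

    public static void main(String[] args) {
        ArrayMatchIterator it = new ArrayMatchIterator(new int[] {1, 4, 7});
        BitSet bits = collectDocIds(it, 10);
        System.out.println(bits + " scoreCalls=" + it.scoreCalls);   // prints "{1, 4, 7} scoreCalls=0"
    }
}
```

Because collectDocIds builds a fresh bit set on every call, the caller can still invoke it repeatedly, which matches the re-use behaviour described above.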

> QueryWrapperFilter should not do scoring
> ----------------------------------------
>
>                 Key: LUCENE-1427
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1427
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 2.9
>
>
> The purpose of QueryWrapperFilter is to simply filter to include the docIDs that match
> the query.
> Its implementation is wasteful now because it computes scores for those matching docs
> even though the score is unused.  We could fix this by getting a Scorer and iterating through
> the docs without asking for the score:
> {code}
> Index: src/java/org/apache/lucene/search/QueryWrapperFilter.java
> ===================================================================
> --- src/java/org/apache/lucene/search/QueryWrapperFilter.java	(revision 707060)
> +++ src/java/org/apache/lucene/search/QueryWrapperFilter.java	(working copy)
> @@ -62,11 +62,9 @@
>    public DocIdSet getDocIdSet(IndexReader reader) throws IOException {
>      final OpenBitSet bits = new OpenBitSet(reader.maxDoc());
>  
> -    new IndexSearcher(reader).search(query, new HitCollector() {
> -      public final void collect(int doc, float score) {
> -        bits.set(doc);  // set bit for hit
> -      }
> -    });
> +    final Scorer scorer = query.weight(new IndexSearcher(reader)).scorer(reader);
> +    while(scorer.next())
> +      bits.set(scorer.doc());
>      return bits;
>    }
> {code}
> Maybe I'm missing something, but this seems like a simple win?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

