lucene-java-user mailing list archives

From Michael McCandless <>
Subject Re:, filter, collector) considered less efficient
Date Fri, 08 Jun 2012 17:40:26 GMT
I think that javadoc is stale; my guess is it was written back when
the collect method took a score, but we changed that so the collector
calls .score() if it really needs the score... so I can't think of why
that search method is inherently inefficient.
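
As a rough illustration of that point, here is a minimal, self-contained sketch. The nested Scorer and Collector types are hypothetical stand-ins that only mirror the setScorer/collect callback shape; they are not Lucene's classes, and CountingScorer, DocIdCollector, MaxScoreCollector and run are names invented for this example. It shows that a collector which never asks for the score never pays for scoring.

```java
import java.util.ArrayList;
import java.util.List;

final class LazyScoreSketch {
    // Hypothetical stand-ins for illustration only; NOT Lucene's classes.
    interface Scorer { float score(); }

    interface Collector {
        void setScorer(Scorer scorer); // called before any hits are delivered
        void collect(int doc);         // called once per matching document
    }

    /** Scorer that counts how often score() is actually invoked. */
    static final class CountingScorer implements Scorer {
        int calls = 0;
        public float score() { calls++; return 1.0f; }
    }

    /** Collects doc ids only; it never asks for a score. */
    static final class DocIdCollector implements Collector {
        final List<Integer> docs = new ArrayList<>();
        public void setScorer(Scorer scorer) { /* score not needed, so ignore it */ }
        public void collect(int doc) { docs.add(doc); }
    }

    /** Tracks the best score seen, so it calls score() once per hit. */
    static final class MaxScoreCollector implements Collector {
        private Scorer scorer;
        float maxScore = Float.NEGATIVE_INFINITY;
        public void setScorer(Scorer scorer) { this.scorer = scorer; }
        public void collect(int doc) { maxScore = Math.max(maxScore, scorer.score()); }
    }

    /** Simplified search loop: hands every hit to the collector, without a score. */
    static void run(Collector collector, Scorer scorer, int[] hits) {
        collector.setScorer(scorer);
        for (int doc : hits) {
            collector.collect(doc);
        }
    }
}
```

With this shape, the cost of scoring is paid only by collectors that call score(), which is why passing a custom collector is not slow by itself.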

I'll fix the javadocs (remove that warning).

Mike McCandless

On Fri, Jun 8, 2012 at 1:32 PM, Paul Hill <> wrote:
> I noticed today that my code calls
> (Query query, Filter filter, Collector collector)
> But I also noticed that the docs say
> "Applications should only use this if they need all of the matching documents. The high-level
> search API (, Filter, int)
> ) is usually more efficient, as it skips non-high-scoring hits."
> Which makes complete sense since I didn't provide it with any count limit.
> My original, but apparently inefficient call is:
>  , securityFilter, dedupingCollector);
> The userQuery is really an enhanced query based on what the user entered, not literally
> the user's query.
> The dedupingCollector uses one FieldCache array (FieldCache.DEFAULT.getStrings(reader, deDupField))
> to work out which documents to collect and which to reject, saving a list of the first
> occurrences of documents.
> I don't think I can use the contrib DuplicateFilter, because my duplicates are not guaranteed
> to be in the same index segment.
> So am I being misled by my interpretation of the JavaDoc comment, even though I really
> DON'T "need all matching documents", or is there some way to work a count limit and
> filtering into the whole chain of filters and collectors?
> -Paul
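
One possible way to combine de-duplication with a count limit inside a single collector is sketched below. This is only an illustration under stated assumptions: DedupTopNSketch, Hit, collect(doc, score) and topDocs are hypothetical names, not Lucene API, and the keys array merely stands in for the per-document strings that a field cache lookup would return. The sketch keeps the first occurrence of each key and retains only the `limit` highest-scoring unique hits in a bounded min-heap.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Set;

final class DedupTopNSketch {
    /** A retained hit: document id plus its score. */
    static final class Hit {
        final int doc;
        final float score;
        Hit(int doc, float score) { this.doc = doc; this.score = score; }
    }

    private final String[] keys;  // hypothetical per-document de-dup key, indexed by doc id
    private final int limit;      // maximum number of unique hits to keep
    private final Set<String> seen = new HashSet<>();
    // Min-heap ordered by score, so the weakest retained hit is evicted first.
    private final PriorityQueue<Hit> heap =
            new PriorityQueue<Hit>((a, b) -> Float.compare(a.score, b.score));

    DedupTopNSketch(String[] keys, int limit) {
        this.keys = keys;
        this.limit = limit;
    }

    /** Called once per matching document, in the order hits arrive. */
    void collect(int doc, float score) {
        if (!seen.add(keys[doc])) {
            return; // key already seen: reject this duplicate
        }
        heap.add(new Hit(doc, score));
        if (heap.size() > limit) {
            heap.poll(); // drop the lowest-scoring unique hit
        }
    }

    /** Returns the retained doc ids, best score first. */
    List<Integer> topDocs() {
        List<Hit> hits = new ArrayList<>(heap);
        hits.sort((a, b) -> Float.compare(b.score, a.score));
        List<Integer> docs = new ArrayList<>();
        for (Hit h : hits) {
            docs.add(h.doc);
        }
        return docs;
    }
}
```

Note the trade-off in this design: because the first occurrence of a key wins, a later duplicate with a higher score is still rejected. Keeping the best-scoring document per key instead would need a map from key to its current best hit.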
