lucene-java-user mailing list archives

From Tod Olson <...@uchicago.edu>
Subject Re: How to limit SimpleCollector at N documents?
Date Fri, 18 Aug 2017 16:01:02 GMT
Thank you, this looks exactly like what I need!

-Tod

On Aug 18, 2017, at 1:42 AM, Adrien Grand <jpountz@gmail.com> wrote:

You could write a collector wrapper (have a look at FilterCollector, maybe) that throws a
CollectionTerminatedException whenever more than X hits have been collected in total. Collection
will likely stop in the middle of the first segment, and each later segment will then be
terminated before any of its hits are collected.

FYI, you can throw a CollectionTerminatedException not only from the collect method but also
from the getLeafCollector method, which lets you skip a segment entirely before even starting
to look for matches.

We have such a collector in Elasticsearch; feel free to copy-paste it and adapt it to your needs
if you want. It is licensed under ASL2: https://github.com/elastic/elasticsearch/blob/36a5cf8f35e5cbaa1ff857b5a5db8c02edc1a187/core/src/main/java/org/elasticsearch/search/query/EarlyTerminatingCollector.java
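
The control flow described above can be sketched without Lucene on the classpath. This is only
an illustration under stated assumptions: TerminatedException, LeafCollector, Collector,
LimitingCollector and the search loop below are simplified, hypothetical stand-ins that mirror
the shape of Lucene's CollectionTerminatedException, LeafCollector, Collector and the searcher's
per-segment loop. They are not the real Lucene classes and not the Elasticsearch collector
linked above. Segments are modeled as plain int arrays of matching doc IDs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins that mimic the shape of Lucene's collector API.
class EarlyTerminationSketch {

    /** Stand-in for Lucene's CollectionTerminatedException. */
    static class TerminatedException extends RuntimeException {}

    /** Stand-in for LeafCollector: receives the matching doc IDs of one segment. */
    interface LeafCollector {
        void collect(int doc);
    }

    /** Stand-in for Collector: hands out one LeafCollector per segment. */
    interface Collector {
        LeafCollector getLeafCollector(int segment);
    }

    /** Wrapper that terminates collection once `limit` hits have reached the delegate. */
    static class LimitingCollector implements Collector {
        private final Collector delegate;
        private final int limit;
        private int collected = 0;

        LimitingCollector(Collector delegate, int limit) {
            this.delegate = delegate;
            this.limit = limit;
        }

        @Override
        public LeafCollector getLeafCollector(int segment) {
            // Limit already reached: skip this segment before any matching starts.
            if (collected >= limit) {
                throw new TerminatedException();
            }
            LeafCollector leaf = delegate.getLeafCollector(segment);
            return doc -> {
                leaf.collect(doc);
                // Limit reached inside a segment: abandon the rest of it.
                if (++collected >= limit) {
                    throw new TerminatedException();
                }
            };
        }
    }

    /** Mimics a per-segment search loop; returns how many segments were actually opened. */
    static int search(int[][] segments, Collector collector) {
        int opened = 0;
        for (int s = 0; s < segments.length; s++) {
            LeafCollector leaf;
            try {
                leaf = collector.getLeafCollector(s);
            } catch (TerminatedException e) {
                continue; // whole segment skipped
            }
            opened++;
            try {
                for (int doc : segments[s]) {
                    leaf.collect(doc);
                }
            } catch (TerminatedException e) {
                // Remaining docs of this segment are dropped; the loop moves to the next segment.
            }
        }
        return opened;
    }

    /** Collects at most `limit` doc IDs across all segments. */
    static List<Integer> collectFirst(int[][] segments, int limit) {
        List<Integer> hits = new ArrayList<>();
        Collector base = segment -> doc -> hits.add(doc);
        search(segments, new LimitingCollector(base, limit));
        return hits;
    }

    public static void main(String[] args) {
        int[][] segments = { {0, 1, 2}, {3, 4}, {5} };
        System.out.println(collectFirst(segments, 4)); // prints [0, 1, 2, 3]
    }
}
```

In this sketch the exception thrown from collect ends only the current segment, which matches
the "moves on to the next leaf" behavior in the original question. The check in getLeafCollector
is what makes every later segment a cheap no-op, so the search as a whole ends shortly after the
limit is reached.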

On Thu, Aug 17, 2017 at 9:46 PM, Tod Olson <tod@uchicago.edu> wrote:
Hi everyone,

I'm modifying an existing application, which uses a Lucene SimpleCollector to return document
ids and some other fields from a search. For various reasons, we now want to place an upper
bound on the number of documents actually collected.

Is there a reasonable way to put a limit on the results returned by a SimpleCollector? Or
do I need to change Collectors?

Based on the docs, I could keep a counter and raise a CollectionTerminatedException after
N documents, but then the search moves on to the next leaf. I'd like to have the entire search
terminate and return the collected documents.

Any assistance for a Lucene novice is greatly appreciated!

-Tod


Tod Olson <tod@uchicago.edu>
Systems Librarian
Interim Director for Integrated Library Systems
University of Chicago Library

