lucene-java-user mailing list archives

From Tod Olson <...@uchicago.edu>
Subject Re: How to limit SimpleCollector at N documents?
Date Fri, 18 Aug 2017 16:00:27 GMT
dr,

I can't speak authoritatively, but I have seen references to leaves in the API documentation
and am making some inferences. It seems to have to do with how the indexes are organized internally.
From that and the comments in the LeafReaderContext doc, it seems that search contexts are arranged
hierarchically, with parents and children. See the javadoc for the collect() method
of the SimpleCollector class:

http://lucene.apache.org/core/6_6_0/core/org/apache/lucene/search/SimpleCollector.html

This seems familiar from other inverted file indexing systems, but I've not found a really detailed
description of how this works in Lucene.
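
To make the "next leaf" behaviour concrete, here is a toy model of my understanding: the searcher visits the leaves (index segments) one after another, and a CollectionTerminatedException is caught per leaf, so it abandons only the current leaf. Everything in this sketch (LeafLoopSketch, LeafTerminated, CappedCollector, searchLeaves) is a hypothetical stand-in written for illustration, not Lucene's actual classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: hypothetical stand-ins, not Lucene's API.
class LeafLoopSketch {

    /** Stand-in for a per-leaf "stop collecting this leaf" signal. */
    static class LeafTerminated extends RuntimeException {}

    /** Keeps at most `limit` doc IDs, then signals termination. */
    static class CappedCollector {
        final int limit;
        final List<Integer> docs = new ArrayList<>();

        CappedCollector(int limit) { this.limit = limit; }

        void collect(int docId) {
            if (docs.size() >= limit) {
                throw new LeafTerminated();
            }
            docs.add(docId);
        }
    }

    /** Walks every leaf in order and returns how many leaves were entered. */
    static int searchLeaves(List<int[]> leaves, CappedCollector collector) {
        int leavesEntered = 0;
        for (int[] leaf : leaves) {
            leavesEntered++;
            try {
                for (int docId : leaf) {
                    collector.collect(docId);
                }
            } catch (LeafTerminated e) {
                // Only this leaf is abandoned; the loop moves on to the next one.
            }
        }
        return leavesEntered;
    }

    public static void main(String[] args) {
        List<int[]> leaves = Arrays.asList(new int[]{1, 2, 3}, new int[]{4, 5}, new int[]{6});
        CappedCollector collector = new CappedCollector(2);
        int entered = searchLeaves(leaves, collector);
        // prints "[1, 2] collected, 3 leaves entered"
        System.out.println(collector.docs + " collected, " + entered + " leaves entered");
    }
}
```

In this model the cap on collected documents holds, but every remaining leaf is still entered before the search ends.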

-Tod

On Aug 18, 2017, at 12:59 AM, dr <bforevdr@163.com> wrote:

I used to do the same thing. My approach was also to throw an exception to break out of the search.
What does "then the search moves on to the next leaf" mean?
At 2017-08-18 03:46:02, "Tod Olson" <tod@uchicago.edu> wrote:
Hi everyone,

I'm modifying an existing application, which uses a Lucene SimpleCollector to return document
ids and some other fields from a search. For various reasons, we now want to place an upper
bound on the number of documents actually collected.

Is there a reasonable way to put a limit on the results returned by a SimpleCollector? Or
do I need to change Collectors?

Based on the docs, I could keep a counter and raise a CollectionTerminatedException after
N documents, but then the search moves on to the next leaf. I'd like to have the entire search
terminate and return the collected documents.
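
One possible shape for "terminate the entire search", sketched as a toy model rather than real Lucene code: keep the counter, but signal the limit with an exception type that the per-leaf loop does not catch, so it propagates out of the whole search and is caught by the caller, who keeps the partial results. All names here (AbortSearchSketch, LimitReached, LimitedCollector, searchLeaves, searchWithLimit) are hypothetical. Whether this is appropriate in a real application is an assumption that should be checked against the Lucene version in use.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: hypothetical stand-ins, not Lucene classes.
class AbortSearchSketch {

    /** Hypothetical unchecked signal that the leaf loop does NOT catch. */
    static class LimitReached extends RuntimeException {}

    /** Collects doc IDs and aborts as soon as the limit has been reached. */
    static class LimitedCollector {
        final int limit;
        final List<Integer> docs = new ArrayList<>();
        int leavesSeen = 0;

        LimitedCollector(int limit) {
            if (limit <= 0) {
                throw new IllegalArgumentException("limit must be positive");
            }
            this.limit = limit;
        }

        void startLeaf() { leavesSeen++; }

        void collect(int docId) {
            docs.add(docId);
            if (docs.size() >= limit) {
                throw new LimitReached(); // nothing more is needed
            }
        }
    }

    /** Stand-in for the searcher: walks every leaf and catches nothing. */
    static void searchLeaves(List<int[]> leaves, LimitedCollector collector) {
        for (int[] leaf : leaves) {
            collector.startLeaf();
            for (int docId : leaf) {
                collector.collect(docId);
            }
        }
    }

    /** Runs the search and returns whatever was collected before the abort. */
    static LimitedCollector searchWithLimit(List<int[]> leaves, int limit) {
        LimitedCollector collector = new LimitedCollector(limit);
        try {
            searchLeaves(leaves, collector);
        } catch (LimitReached e) {
            // Expected: the limit was hit, so keep the partial results.
        }
        return collector;
    }

    public static void main(String[] args) {
        List<int[]> leaves = Arrays.asList(new int[]{1, 2, 3}, new int[]{4, 5}, new int[]{6});
        LimitedCollector collector = searchWithLimit(leaves, 4);
        // prints "[1, 2, 3, 4] after 2 leaves"
        System.out.println(collector.docs + " after " + collector.leavesSeen + " leaves");
    }
}
```

Here the third leaf is never entered, which is the behaviour asked for above. The cost is using an exception for control flow, and the caller has to remember to catch it.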

Any assistance for a Lucene novice is greatly appreciated!

-Tod


Tod Olson <tod@uchicago.edu>
Systems Librarian
Interim Director for Integrated Library Systems
University of Chicago Library

