lucene-java-user mailing list archives

From Adrien Grand <jpou...@gmail.com>
Subject Re: NRT + static rank based sorting
Date Tue, 09 Jul 2013 21:13:38 GMT
Hi Sriram,

On Tue, Jul 9, 2013 at 5:06 AM, Sriram Sankar <sankar@gmail.com> wrote:
> I've finally got something running and will send you some performance
> numbers as promised shortly.  In the meanwhile, I've a question regarding
> the use of real time indexing along with ordering by static rank.  Before
> each search, I do the reopen as follows:
>
>     public void refresh() throws IOException {
>         DirectoryReader r = DirectoryReader.openIfChanged(reader);
>         if (r != null) {
>             reader.close();
>             reader = r;
>             this.live = SortingAtomicReader.wrap(
>                     new SlowCompositeReaderWrapper(reader),
>                     new StaticRankSorter());
>         }
>     }
>
> This works fine.  However, I believe the index is resorted every time I
> reopen the index.  Ideally, it would be nice to do the sort more
> incrementally each time a new document gets added.  I assume that this is
> not easy - but just in case you have ideas, I'd like to hear them.

I think a good trade-off could be to fully collect the small segments
that come from incremental updates. Since they are small, collecting
them will be fast anyway. Conversely, the bottleneck is likely
the collection of large segments. This is why we chose to tackle the
problem of online sorting using a merge policy (SortingMergePolicy).
Segments are only sorted when merging, meaning that small NRT
(flushed) segments won't be sorted but large (merged) segments will
be.

Then computing the top hits is just a matter of computing the best
hits on every segment and merging them into a single hit list:
 - for flushed segments, you need to fully collect them like Lucene
does by default,
 - for sorted segments, you can early-terminate collection on a
per-segment basis when enough matches have been collected.
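
To make the two cases concrete, here is a minimal sketch in plain Java. It
does not use any Lucene API: the Segment and Result classes, the search
method, and the integer ranks (higher is better) are all made up for
illustration. A sorted segment is assumed to list its matching documents in
descending rank order, so collection can stop once k matches have been seen;
a flushed segment is scanned completely.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

final class PerSegmentTopK {

    /** Toy segment (hypothetical): static ranks of the matching docs, in doc ID order. */
    static final class Segment {
        final boolean sorted;   // true for merged segments stored by descending rank
        final int[] matchRanks;

        Segment(boolean sorted, int... matchRanks) {
            this.sorted = sorted;
            this.matchRanks = matchRanks;
        }
    }

    /** Merged hit list plus the number of matches that were actually visited. */
    static final class Result {
        final List<Integer> hits = new ArrayList<>();
        int visited;
    }

    static Result search(List<Segment> segments, int k) {
        Result result = new Result();
        if (k <= 0) {
            return result;
        }
        List<Integer> candidates = new ArrayList<>();
        for (Segment seg : segments) {
            // Min-heap holding the best k ranks seen so far in this segment.
            PriorityQueue<Integer> best = new PriorityQueue<>();
            for (int rank : seg.matchRanks) {
                result.visited++;
                if (best.size() < k) {
                    best.add(rank);
                } else if (rank > best.peek()) {
                    best.poll();
                    best.add(rank);
                }
                // In a sorted segment the first k matches are the best k,
                // so the rest of the segment can be skipped.
                if (seg.sorted && best.size() == k) {
                    break;
                }
            }
            candidates.addAll(best);
        }
        // Merge the per-segment hits into a single list and keep the best k.
        candidates.sort(Collections.reverseOrder());
        result.hits.addAll(candidates.subList(0, Math.min(k, candidates.size())));
        return result;
    }

    public static void main(String[] args) {
        Segment flushed = new Segment(false, 3, 9, 1, 7);    // small NRT segment, unsorted
        Segment merged = new Segment(true, 10, 8, 6, 4, 2);  // large merged segment, sorted
        Result r = search(Arrays.asList(flushed, merged), 3);
        System.out.println(r.hits + " after visiting " + r.visited + " matches");
        // prints "[10, 9, 8] after visiting 7 matches"
    }
}
```

In this toy run the flushed segment is visited in full (4 matches) while the
sorted one stops after 3 of its 5 matches, which is where the savings on
large segments would come from.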

-- 
Adrien
