lucy-user mailing list archives

From Marvin Humphrey
Subject Re: [lucy-user] Hits offset and search performance
Date Tue, 13 Nov 2012 17:33:21 GMT
On Tue, Nov 13, 2012 at 8:47 AM, Thomas den Braber wrote:
> Peter Karman wrote on 11/12/2012 5:46 PM
>> Thomas den Braber wrote on 11/12/12 3:53 AM:
>> > On Sun, Nov 11, 2012 at 04:19 AM, Marvin Humphrey wrote:

>> >> I would assume that Swish-e and Lucy are implemented differently.  I don't
>> >> know what seek() does in the context of Swish-e.
>> >
>> > Seek will fast forward through the search result without first specifying
>> > the total hits you want to collect and not reading the results that
>> > exists before the seek pointer.  In swish you also do not have to say in
>> > advance how many hits you want.
>> $hits->seek(10); # skip the first 9 hits
>> This is similar to the Lucy::Index::Lexicon->seek() method.
>> It would be useful to have it for Lucy::Search::Hits too, imo.
> That would be nice. It also would be useful to change the num_wanted and
> offset after the '$searcher->hits()' call has been done, without performing
> the search again with a different offset.

In Lucy, it would be impossible to change `num_wanted` and `offset` after the
fact arbitrarily without rerunning the search.  If the priority queue had a size
of 30 because `offset` was 20 and `num_wanted` was 10, we only have the top 30
hits.  You can't seek to 31, 50, 100 or whatever without rerunning the search
with a bigger priority queue.
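
The point above can be illustrated with a toy model. This is a minimal Python
sketch, not Lucy's actual collector code: `collect_top` and the synthetic
scores are made up for illustration, and the only assumption carried over is
that the search retains `offset + num_wanted` hits and discards the rest.

```python
import heapq


def collect_top(scored_docs, offset, num_wanted):
    """Toy model: keep only the best (offset + num_wanted) hits while scanning.

    Returns (retained, page): everything the queue still holds, best first,
    and the slice the caller asked for.
    """
    queue_size = offset + num_wanted
    heap = []  # min-heap of (score, doc_id); heap[0] is the weakest retained hit
    for doc_id, score in scored_docs:
        if len(heap) < queue_size:
            heapq.heappush(heap, (score, doc_id))
        elif (score, doc_id) > heap[0]:
            # Better than the weakest retained hit, which is dropped for good.
            heapq.heapreplace(heap, (score, doc_id))
    retained = sorted(heap, reverse=True)
    return retained, retained[offset:]


# Synthetic corpus: 1000 docs, where doc N has score N (hypothetical data).
docs = [(n, float(n)) for n in range(1000)]

# offset=20, num_wanted=10: the queue holds 30 hits and nothing more.
retained, page = collect_top(docs, offset=20, num_wanted=10)

# Reaching hit 31 and beyond means scanning again with a bigger queue.
retained_again, next_page = collect_top(docs, offset=30, num_wanted=10)
```

After the first call, `retained` has exactly 30 entries, so there is nothing
left to seek into; the second call is a full rerun with a queue of 40.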

I would oppose a seek() which runs implicit searches behind the scenes because
it would surprise users with a hidden performance cost.

If the lack of seek() is forcing you to re-architect your program while
transitioning to Lucy, then it is forcing you to re-architect it in ways which
make best use of Lucy.  Adding a deceptive interface on top of Lucy to mimic
Swish-e wouldn't do anybody any favors -- in fact, it would be actively
harmful, as it would encourage people to use Lucy inefficiently.  Then we'd

Marvin Humphrey
