lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Michael McCandless (JIRA)" <>
Subject [jira] Commented: (LUCENE-1614) Add next() and skipTo() variants to DocIdSetIterator that return the current doc, instead of boolean
Date Mon, 01 Jun 2009 15:20:07 GMT


Michael McCandless commented on LUCENE-1614:

bq. But then, I don't see the benefit of adding that advance() call, since in the while loop
you'd still need to check whether filterDoc > scorerDoc. So why do it? Is it just a matter
of 'not relying on the above instruction, since at that point I know both should be advanced'?

Since you know both should be advanced, why loop back and do an
unecessary if check?  Ie, I the CPU will do more work (computing a
redundant if) if we don't advance both.

However, my solution (also advancing the scorer) is still not right,
because on cycling back you know scorerDoc >= filterDoc, so we've also
wasted an if check there... hmmm.

So actually let's just stick w/ your current approach, since it's a
straight conversion of what trunk currently does, except can we stop
checking NO_MORE_DOCS on each advance/nextDoc, and only check it when
we're about to collect a doc?

And, anyway, maybe we shouldn't fret so much about it; we know this
filter loop needs to change for 2.9 anyway.  I think we want something
like this:

  * If filter is random access, it should be pushed all the way down
    and applied just like deleted docs

  * If filter is "relatively" sparse compared to query, then we should
    to use the (to-be-added) DISI.check() method

  * Else, add filter as clause on BQ.  If incoming Query is already a
    BQ, clone that and tack filter on as clause; else, make a BQ and
    add both.

bq. Can you explain me what that means? Maybe I should also change it in my eclipse?

We're supposed to do this (tell svn to use native EOL -- "svn propset
svn:eol-style native") for all our sources, to avoid issues like

bq. What if we document that you shouldn't call nextDoc() after it return NO_MORE_DOCS for

OK, let's do that.  We can leave "taking advantage of this" (start
method) to another issue.  But let's be crystal clear on the semantics
of DISI for this issue.

> Add next() and skipTo() variants to DocIdSetIterator that return the current doc, instead
of boolean
> ----------------------------------------------------------------------------------------------------
>                 Key: LUCENE-1614
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Shai Erera
>             Fix For: 2.9
>         Attachments: LUCENE-1614.patch, LUCENE-1614.patch, LUCENE-1614.patch, LUCENE-1614.patch,
LUCENE-1614.patch, LUCENE-1614.patch, LUCENE-1614.patch, LUCENE-1614.patch, LUCENE-1614.patch,
> See
for the full discussion. The basic idea is to add variants to those two methods that return
the current doc they are at, to save successive calls to doc(). If there are no more docs,
return -1. A summary of what was discussed so far:
> # Deprecate those two methods.
> # Add nextDoc() and skipToDoc(int) that return doc, with default impl in DISI (calls
next() and skipTo() respectively, and will be changed to abstract in 3.0).
> #* I actually would like to propose an alternative to the names: advance() and advance(int)
- the first advances by one, the second advances to target.
> # Wherever these are used, do something like '(doc = advance()) >= 0' instead of comparing
to -1 for improved performance.
> I will post a patch shortly

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message