lucene-java-user mailing list archives

From Michael Sokolov <msoko...@safaribooksonline.com>
Subject Re: AIOOBE in extension of ToParentBlockJoinQuery
Date Sat, 07 Feb 2015 22:30:58 GMT
So I dug into this a bit further, and found that it happens with the stock 
Lucene query as well, in 4.10.2.  I looked at the code on trunk, and I 
don't think the situation is different there.  Basically, if you delete a 
parent document, orphaning some child document(s), and then merge, 
ToParentBlockJoinQuery (TPBJQuery) fails with an exception if the orphaned 
docs are matched. I find this pretty surprising, and it's not that hard to 
do something more expected: the orphaned children can just be skipped by 
testing for this condition (instead of just asserting the contrary, as 
we do now).
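
To illustrate the skipping idea, here is a minimal standalone sketch. It is 
not the Lucene code: java.util.BitSet stands in for the parent bit set (its 
nextSetBit also returns -1 when no set bit remains), a sorted int array 
stands in for the child scorer, and the class and method names are 
hypothetical. It assumes the orphaned children sit after the last 
surviving parent, which is the case where the lookup comes back as -1.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical model of the block-join parent lookup; not Lucene's BlockJoinScorer.
final class OrphanSkipSketch {

    // Returns the parent doc IDs for the matched children, skipping orphans.
    // sortedChildDocs: matched child doc IDs in increasing order (stand-in for the child scorer).
    // parentBits: one bit set per parent doc ID (stand-in for the parent filter bits).
    static List<Integer> matchingParents(int[] sortedChildDocs, BitSet parentBits) {
        List<Integer> parents = new ArrayList<>();
        int i = 0;
        while (i < sortedChildDocs.length) {
            int parentDoc = parentBits.nextSetBit(sortedChildDocs[i]);
            if (parentDoc == -1) {
                // No parent at or after this child: it is an orphan, and so is
                // every later child. Test for it and stop instead of asserting.
                break;
            }
            parents.add(parentDoc);
            // Consume every child that belongs to this parent's block.
            while (i < sortedChildDocs.length && sortedChildDocs[i] <= parentDoc) {
                i++;
            }
        }
        return parents;
    }

    public static void main(String[] args) {
        BitSet parentBits = new BitSet();
        parentBits.set(2); // children 0 and 1 belong to parent 2
        parentBits.set(5); // child 3 belongs to parent 5
        // Docs 6 and 7 are children whose parent was deleted and merged away.
        int[] children = {0, 1, 3, 6, 7};
        System.out.println(matchingParents(children, parentBits)); // prints [2, 5]
    }
}
```

In this model the orphaned docs 6 and 7 are dropped quietly rather than 
producing a -1 that is later used as an index.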

If some committer speaks up and agrees, I'll at least open an issue.

-Mike

On 2/5/2015 12:02 PM, Michael Sokolov wrote:
> I've run into an exception, and I'm trying to understand whether it is 
> something that can just happen if the index doesn't conform to the 
> expectations of the TPBJQ, or if I've somehow messed things up in my 
> extension of that query.
>
> The exception I'm seeing is in BlockJoinScorer.nextDoc().  It's clear 
> that the assertion below is being contravened: no parentDoc is found 
> for some child doc.
>
>         // Gather all children sharing the same parent as
>         // nextChildDoc
>
>         parentDoc = parentBits.nextSetBit(nextChildDoc);
>         assert parentDoc != -1;
>
>         //System.out.println("  nextChildDoc=" + nextChildDoc);
>         if (
>             // parentDoc = -1 shouldn't happen, but it did.  I'm not sure
>             // if this is a consequence of our allowing parents to be a
>             // child -- I don't think so -- it seems more likely the
>             // index can just get in a state where there are children
>             // with no parent, and that could cause this?
>             parentDoc == -1 ||
>             (acceptDocs != null && !acceptDocs.get(parentDoc))
>             ) {
>           // Parent doc not accepted; skip child docs until
>           // we hit a new parent doc:
>           do {
>             nextChildDoc = childScorer.nextDoc();
>           } while (nextChildDoc <= parentDoc);
>
> What I'm wondering is why we believe that assertion?  Is there 
> something that guarantees the state of the index beyond the user 
> having indexed their documents correctly?
>
> I'm concerned that a change I made to the query may be causing this, 
> but I can't see how.  What I did is to allow a parent doc to also be a 
> child doc, and I also passed acceptDocs when creating the childScorer, 
> so that child docs are filtered by the prevailing filter, as well as 
> parent docs.
>
> Any pointers or ideas welcome!  Thanks
>
> -Mike

