jackrabbit-oak-issues mailing list archives

From "Thomas Mueller (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (OAK-6391) With FastQuerySize, getSize() returns -1 if there are exactly 21 rows
Date Mon, 03 Jul 2017 09:15:00 GMT

    [ https://issues.apache.org/jira/browse/OAK-6391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072154#comment-16072154 ]

Thomas Mueller commented on OAK-6391:

20 is the minimum number of pre-fetched entries, which means the first 20 entries are always
read. If the result is small, the size is known exactly (no need to guess). If there are more
entries, then depending on the FastQuerySize setting, either -1 is returned (for unknown), or
the index is asked for an estimate.

The root cause: the code read 20 entries and then asked the index for a size estimate.
However, if there are exactly 21 entries, the cursor on the oak-core side is already closed
at that point, so -1 is returned. The cursor is not closed if there are more entries than that.

The solution I implemented: just before asking the index for a size estimate, check whether
there are more entries. There is no need to ask for an estimate if there are none.
That is, call hasNext before calling getSize; if hasNext returns false, getSize does not
need to be called.
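
The order of checks described above can be sketched roughly as follows. This is only an
illustration, not the actual oak-core code: the class PrefetchSizeSketch, the SizeEstimator
interface, and the resultSize method are hypothetical names, and the sketch does not try to
reproduce the cursor-closing behaviour that caused the bug.

```java
import java.util.Iterator;

// Hypothetical sketch; these names are illustrative and are not Oak's real classes.
final class PrefetchSizeSketch {

    /** Minimum number of pre-fetched entries, as stated in the comment. */
    static final int PREFETCH_MIN = 20;

    /** Stand-in for an index that can supply a size estimate (-1 means unknown). */
    interface SizeEstimator {
        long estimateSize();
    }

    /**
     * Reads up to PREFETCH_MIN rows, then checks hasNext() before asking for an estimate.
     * If no rows remain, the number of rows read is the exact size.
     */
    static long resultSize(Iterator<?> rows, boolean fastQuerySize, SizeEstimator index) {
        int read = 0;
        while (read < PREFETCH_MIN && rows.hasNext()) {
            rows.next();
            read++;
        }
        if (!rows.hasNext()) {
            // Everything was read during pre-fetch: the size is known, no estimate needed.
            return read;
        }
        // More rows exist: either ask the index for an estimate, or report "unknown".
        return fastQuerySize ? index.estimateSize() : -1;
    }
}
```

With this ordering, the estimate is only requested when rows are known to remain after the
pre-fetch.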

> With FastQuerySize, getSize() returns -1 if there are exactly 21 rows
> ---------------------------------------------------------------------
>                 Key: OAK-6391
>                 URL: https://issues.apache.org/jira/browse/OAK-6391
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: query
>            Reporter: Thomas Mueller
>            Assignee: Thomas Mueller
>            Priority: Critical
>              Labels: candidate_oak_1_0, candidate_oak_1_2, candidate_oak_1_4, candidate_oak_1_6
>             Fix For: 1.8, 1.7.3, 1.6.3
> If FastQuerySize is enabled, and the query result has exactly 21 rows, then getSize()
> returns -1. With 20 or 22 rows, the correct value is returned.
> Comment:
> One cannot assume getSize() _always_ returns a value larger than 0. getSize() may return
> a value that is too high or too low; it is just an estimate. The loop used to read the rows
> should _only_ have "it.hasNext() && i < maxCount" as a condition, with maxCount
> being, for example, 30. If the number of rows read is smaller than 30, then that's the real
> row count. If it's 30, then you can use getSize() as an estimate. That can still be -1 for
> "unknown", even with FastQuerySize enabled.
> But it's true that -1 is unexpected in this case.
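
The reading loop recommended in the issue comment can be sketched as below. The class
BoundedCountSketch and the method countOrEstimate are hypothetical names; the LongSupplier
stands in for whatever getSize() call the client actually has available.

```java
import java.util.Iterator;
import java.util.function.LongSupplier;

// Hypothetical sketch of the client-side loop; not part of any Oak or JCR API.
final class BoundedCountSketch {

    /**
     * Reads at most maxCount rows. If fewer were available, that is the real row count.
     * Otherwise falls back to getSize, which is only an estimate and may still be -1.
     */
    static long countOrEstimate(Iterator<?> it, int maxCount, LongSupplier getSize) {
        int i = 0;
        // The only loop condition, as recommended in the comment.
        while (it.hasNext() && i < maxCount) {
            it.next();
            i++;
        }
        if (i < maxCount) {
            return i; // exact count
        }
        return getSize.getAsLong(); // estimate; callers must handle -1 ("unknown")
    }
}
```

A caller would pass maxCount = 30, as in the comment, and treat a return value of -1 as
"unknown" rather than as an error.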
