jackrabbit-oak-issues mailing list archives

From "Thomas Mueller (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (OAK-2556) do intermediate commit during async indexing
Date Tue, 03 Mar 2015 09:04:06 GMT

    [ https://issues.apache.org/jira/browse/OAK-2556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344768#comment-14344768 ]

Thomas Mueller commented on OAK-2556:
-------------------------------------

Making this configurable (if that's easy) is a good idea.

I don't think it's a big problem if the index is inconsistent within a revision, as long as
this is documented. If enabled, only some nodes of a large transaction might be visible in a
query, but that would only apply where async indexes are used (typically for ordering or
full-text constraints). The query would not return nodes that don't match the conditions, but
it might not return _all_ nodes that match. That is to be expected anyway when using an async
index. If a stronger guarantee is needed (for example a lookup with propertyName=x), then
typically a synchronous index is used.

> do intermediate commit during async indexing
> --------------------------------------------
>
>                 Key: OAK-2556
>                 URL: https://issues.apache.org/jira/browse/OAK-2556
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: oak-lucene
>    Affects Versions: 1.0.11
>            Reporter: Stefan Egli
>
> A recent issue found at a customer reveals a potential problem with the async indexer.
> Reading AsyncIndexUpdate.updateIndex, it looks like it is doing the entire update of the
> async indexer *in one go*, i.e. in one commit.
> However, when there is, for some reason, a huge diff that the async indexer has to
> process, the 'one big commit' can become gigantic. There is in fact no limit to the size
> of the commit.
> So the suggestion is to do intermediate commits while the async indexer is running.
> The reason this is acceptable is that with async indexing the index is not 100%
> up-to-date anyway, so it would not make much of a difference if it committed after every
> 100 or 1000 changes either.
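
A minimal sketch of the "commit after every N changes" idea, for illustration only. It does
not use or reflect the actual Oak code: BatchedIndexer, indexInBatches, batchSize and the
commit callback are all hypothetical names, and a plain list of changes stands in for the
diff that the async indexer walks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch; none of these names are Oak APIs. */
public class BatchedIndexer {

    /**
     * Feeds changes to the commit callback in batches of at most batchSize,
     * instead of collecting everything into one big commit.
     * Returns the number of commits performed.
     */
    public static <T> int indexInBatches(Iterable<T> changes, int batchSize,
                                         Consumer<List<T>> commit) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<T> pending = new ArrayList<>();
        int commits = 0;
        for (T change : changes) {
            pending.add(change);
            if (pending.size() >= batchSize) {
                // Intermediate commit: hand over a copy, then start a new batch.
                commit.accept(new ArrayList<>(pending));
                pending.clear();
                commits++;
            }
        }
        if (!pending.isEmpty()) {
            // Final commit for the remainder of the diff.
            commit.accept(new ArrayList<>(pending));
            commits++;
        }
        return commits;
    }

    public static void main(String[] args) {
        List<Integer> changes = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            changes.add(i);
        }
        List<Integer> sizes = new ArrayList<>();
        int commits = indexInBatches(changes, 1000, batch -> sizes.add(batch.size()));
        // prints "3 commits, sizes [1000, 1000, 500]"
        System.out.println(commits + " commits, sizes " + sizes);
    }
}
```

With a batch size of 1000, a diff of 2500 changes is written as three bounded commits rather
than one unbounded commit. Between two intermediate commits a query against the async index
could see only part of the large change, which is the inconsistency discussed in the comment
above.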



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
