jackrabbit-oak-issues mailing list archives

From "Julian Reschke (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
Date Tue, 21 Feb 2017 12:41:44 GMT

    [ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15875914#comment-15875914 ]

Julian Reschke commented on OAK-4780:

bq. Julian Reschke any estimation how long that "eventually" will take on a never cleaned
up 140TB cluster? And what would the GC do when this strategy does not lead to success during
a maintenance interval?

Too many variables in that question.

bq. Measurements from large clusters showed that the collect phase of a GC can take as long
as 4 hours - only processing changes from the last day. How would this work? Let's say there
are 10 million node candidates daily. How would you configure the limit to operate in this

The proposal is about cases where the VGC hasn't been run regularly. If it *does* run regularly
but still can't keep up, we have a different problem, right?

bq. How would the same setting work in a cluster never cleaned up (worst case, I know)?

There seem to be two choices: restrict ourselves to a maintenance window, which may mean that
we never recover, or allow the GC to run beyond the maintenance window.

(FWIW, we currently do not have a defined window, just an interval and the hope that one run
has finished before the next is supposed to start)
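
To make the "may never recover" case concrete, here is a minimal simulation sketch. It is not Oak code: the function name, the parameters, and the model itself (one run per interval, a fixed number of candidates removed per run, a fixed amount of new garbage between runs) are assumptions made for illustration only.

```python
def runs_to_recover(backlog, new_per_interval, capacity_per_run, max_runs=10_000):
    """Simulate window-bounded GC runs (hypothetical model, not Oak's API).

    Each run removes at most `capacity_per_run` candidates; between two runs,
    `new_per_interval` new candidates accumulate. Returns the number of runs
    needed to clear the backlog, or None if it is not cleared in `max_runs`.
    """
    if backlog == 0:
        return 0
    for run in range(1, max_runs + 1):
        backlog -= min(backlog, capacity_per_run)  # one bounded run
        if backlog == 0:
            return run
        backlog += new_per_interval                # garbage created until the next run
    return None
```

Under this model the backlog only shrinks when a run can remove more than what accumulates between runs; if the per-run capacity is equal to or lower than the inflow, a GC restricted to the window never catches up, however many runs it gets.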

> VersionGarbageCollector should be able to run incrementally
> -----------------------------------------------------------
>                 Key: OAK-4780
>                 URL: https://issues.apache.org/jira/browse/OAK-4780
>             Project: Jackrabbit Oak
>          Issue Type: Task
>          Components: core, documentmk
>            Reporter: Julian Reschke
>         Attachments: leafnodes.diff, leafnodes-v2.diff, leafnodes-v3.diff
> Right now, the documentmk's version garbage collection runs in several phases.
> It first collects the paths of candidate nodes, and only once this has been successfully
> finished, starts actually deleting nodes.
> This can be a problem when the regularly scheduled garbage collection is interrupted
> during the path collection phase, maybe due to other maintenance tasks. On the next run, the
> number of paths to be collected will be even bigger, thus making it even more likely to fail.
> We should think about a change in the logic that would allow the GC to run in chunks;
> maybe by partitioning the path space by top level directory.
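
The chunking idea in the description could look roughly like the sketch below. It is an illustration only, not the VersionGarbageCollector implementation: `top_level`, `partition_by_top_level`, `gc_in_chunks`, and the `is_garbage` / `delete` callbacks are hypothetical names, and paths are held in a plain in-memory list. The point is that collection and deletion happen per chunk, so an interrupted run loses at most the work on the current chunk.

```python
from collections import defaultdict


def top_level(path):
    """Return the first path segment, e.g. '/content/a/b' -> 'content'."""
    return path.strip("/").split("/")[0]


def partition_by_top_level(paths):
    """Group paths by their top-level directory (hypothetical helper)."""
    chunks = defaultdict(list)
    for path in paths:
        chunks[top_level(path)].append(path)
    return dict(chunks)


def gc_in_chunks(paths, is_garbage, delete, done=None):
    """Collect and delete one chunk at a time.

    `done` holds the names of chunks finished in an earlier run, so a
    run that was interrupted can resume without redoing them.
    """
    done = set() if done is None else done
    for name, chunk in sorted(partition_by_top_level(paths).items()):
        if name in done:
            continue
        candidates = [p for p in chunk if is_garbage(p)]  # collect phase, this chunk only
        for path in candidates:                           # delete phase, this chunk only
            delete(path)
        done.add(name)                                    # checkpoint after each chunk
    return done
```

A caller would persist `done` between runs; where that checkpoint would live in a real deployment is left open here.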

This message was sent by Atlassian JIRA
