jackrabbit-oak-issues mailing list archives

From "Stefan Egli (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (OAK-2480) Incremental (FileStore)Backup copies the entire source instead of just the delta
Date Tue, 10 Mar 2015 16:59:39 GMT

    [ https://issues.apache.org/jira/browse/OAK-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355201#comment-14355201 ]

Stefan Egli commented on OAK-2480:

[~alex.parvulescu], what's your take on this one, something for 1.2 or 1.4? thx.

> Incremental (FileStore)Backup copies the entire source instead of just the delta
> --------------------------------------------------------------------------------
>                 Key: OAK-2480
>                 URL: https://issues.apache.org/jira/browse/OAK-2480
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: run
>    Affects Versions: 1.1.5
>            Reporter: Stefan Egli
>         Attachments: IncrementalBackupTest.java, oak-2480.incremental.partial.patch
> Running the FileStoreBackup (in oak-run) sequentially should correspond to an incremental
backup. This implies the expectation that the incremental backup is very resource-friendly,
i.e. that it only adds the delta/diff that changed since the last backup. Instead, what can be
seen at the moment is that it copies the entire source store again on each 'incremental' backup.
> Tested with the latest trunk snapshot.
> Suspecting the problem to be as follows: on the first backup the FileStoreBackup stores
a checkpoint created in the source-store and adds it as a property "checkpoint" to the backup
root node, besides the actual backup which is stored in '/root'. 
> On subsequent incremental runs, the backup tries to retrieve said property "checkpoint"
from the backup and uses it in the compactor as the base for the diff.
> Now the problem seems to be that Compactor.compact goes on to call process(), which
does a writer.writeNode(before) (where before is the checkpoint in the origin store but writer
is a writer of the backup store). In this SegmentWriter.writeNode() it fails to find the
'before' segment, and thus traverses the entire tree and copies it from the origin to the backup.
> So the problem looks to be in the area where it assumes it will find this 'checkpoint-before'
in the backup, but that's not the case.
> So a solution would have been to not do the diff between the checkpoint and the current
origin-head, but between the backup-head and the origin-head instead. Apparently this
was not the intention though, as that would mean reading through the entire backup to do
the diffing, and that would be inefficient...
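
The suspected mechanism described above can be illustrated with a small toy model. This is only a sketch and does not use any Oak classes: Store, Node, writeNode, applyDiff and incremental are hypothetical names made up for the illustration, and the one assumption carried over from the report is that a writer can reuse a node only if that node already lives in its own store. The last call, which starts from the backup's own copy, is a contrast case inside the toy model and is not taken from the report or from Oak.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class IncrementalBackupSketch {

    /** Toy tree node that remembers which store owns it (hypothetical, not an Oak type). */
    static final class Node {
        final Store owner;
        final String value;
        final Map<String, Node> children;

        Node(Store owner, String value, Map<String, Node> children) {
            this.owner = owner;
            this.value = value;
            this.children = children;
        }
    }

    /** Toy store that only counts how many nodes were physically written into it. */
    static final class Store {
        int writes = 0;

        Node create(String value, Map<String, Node> children) {
            writes++;
            return new Node(this, value, children);
        }

        /** Reuses a node this store already owns; copies a foreign node recursively. */
        Node writeNode(Node node) {
            if (node.owner == this) {
                return node;
            }
            Map<String, Node> copied = new LinkedHashMap<>();
            for (Map.Entry<String, Node> e : node.children.entrySet()) {
                copied.put(e.getKey(), writeNode(e.getValue()));
            }
            return create(node.value, copied);
        }
    }

    /** One-level diff of before/after, applied on top of base inside the target store. */
    static Node applyDiff(Store target, Node base, Node before, Node after) {
        Map<String, Node> merged = new LinkedHashMap<>();
        for (Map.Entry<String, Node> e : after.children.entrySet()) {
            if (before.children.get(e.getKey()) == e.getValue()) {
                merged.put(e.getKey(), base.children.get(e.getKey())); // unchanged: reuse
            } else {
                merged.put(e.getKey(), target.writeNode(e.getValue())); // changed or added
            }
        }
        return target.create(after.value, merged);
    }

    /** Returns how many nodes one incremental run writes into the backup store. */
    static int incremental(Store backup, Node base, Node before, Node after) {
        int start = backup.writes;
        Node ownBase = backup.writeNode(base); // full copy if base is foreign
        applyDiff(backup, ownBase, before, after);
        return backup.writes - start;
    }

    /** Returns {first backup, incremental from origin checkpoint, incremental from backup copy}. */
    static int[] run(int n) {
        Store origin = new Store();
        Store backup = new Store();
        Map<String, Node> kids = new LinkedHashMap<>();
        for (int i = 0; i < n; i++) {
            kids.put("c" + i, origin.create("v" + i, new LinkedHashMap<>()));
        }
        Node checkpoint = origin.create("root", kids);

        int start = backup.writes;
        Node backupHead = backup.writeNode(checkpoint); // first, full backup
        int first = backup.writes - start;

        // Change a single child in the origin store.
        Map<String, Node> changed = new LinkedHashMap<>(kids);
        changed.put("c0", origin.create("v0-modified", new LinkedHashMap<>()));
        Node head = origin.create("root", changed);

        int fromOriginCheckpoint = incremental(backup, checkpoint, checkpoint, head);
        int fromBackupCopy = incremental(backup, backupHead, checkpoint, head);
        return new int[] {first, fromOriginCheckpoint, fromBackupCopy};
    }

    public static void main(String[] args) {
        int[] r = run(1000);
        System.out.println("first backup writes:             " + r[0]); // 1001
        System.out.println("incremental, origin checkpoint:  " + r[1]); // 1003
        System.out.println("incremental, backup's own copy:  " + r[2]); // 2
    }
}
```

In this toy model the 'incremental' run that hands the origin-side checkpoint to the backup writer costs slightly more than the first full backup, which matches the symptom reported here. Whether the real SegmentWriter behaves the same way is exactly what the issue is trying to establish.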

This message was sent by Atlassian JIRA