jackrabbit-oak-issues mailing list archives

From "Stefan Egli (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (OAK-2480) Incremental (FileStore)Backup copies the entire source instead of just the delta
Date Tue, 10 Mar 2015 16:59:39 GMT

    [ https://issues.apache.org/jira/browse/OAK-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355201#comment-14355201 ]

Stefan Egli commented on OAK-2480:
----------------------------------

[~alex.parvulescu], what's your take on this one: something for 1.2 or 1.4? Thanks.

> Incremental (FileStore)Backup copies the entire source instead of just the delta
> --------------------------------------------------------------------------------
>
>                 Key: OAK-2480
>                 URL: https://issues.apache.org/jira/browse/OAK-2480
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: run
>    Affects Versions: 1.1.5
>            Reporter: Stefan Egli
>         Attachments: IncrementalBackupTest.java, oak-2480.incremental.partial.patch
>
>
> Running the FileStoreBackup (in oak-run) sequentially should correspond to an incremental
backup. This implies the expectation that the incremental backup is resource-friendly,
i.e. that it only adds the delta/diff that changed since the last backup. Instead, what can be
seen at the moment is that it copies the entire source store again on each 'incremental' backup.
> Tested with the latest trunk snapshot.
> I suspect the problem to be as follows: on the first backup, the FileStoreBackup creates
a checkpoint in the source store and adds it as a property "checkpoint" to the backup
root node, alongside the actual backup, which is stored in '/root'.
> On subsequent incremental runs, the backup tries to retrieve said property "checkpoint"
from the backup and uses it in the compactor as the base for the diff.
> Now the problem seems to be that Compactor.compact calls process(), which
does a writer.writeNode(before) (where 'before' is the checkpoint in the origin store, but writer
is a writer of the backup store). In this SegmentWriter.writeNode() it fails to find the
'before' segment, and thus traverses the entire tree and copies it from the origin to the
backup.
> So the problem appears to be the assumption that this 'checkpoint-before' can be found
in the backup, which is not the case.
> One solution would be to diff not between the checkpoint and the current
origin-head, but between the backup-head and the origin-head instead. Apparently this
was not the intention, though, as it would mean reading through the entire backup to do
the diffing, and that would be inefficient...
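
The suspected behaviour described above can be illustrated with a toy model. This is only a sketch under stated assumptions and does not use the Oak API: Node, BackupStore, writeNode and the remembersOriginIds flag are hypothetical names invented for the illustration. It assumes that the backup stores records under its own ids, so a 'before' node identified by an origin id is never recognized and the whole tree is copied on every run. The second store, which remembers origin ids, is included only for contrast.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Toy model only: none of these types exist in Oak.
public class IncrementalBackupSketch {

    /** Toy node: a record id (as assigned by the origin store) plus named children. */
    static final class Node {
        final String recordId;
        final Map<String, Node> children = new LinkedHashMap<>();

        Node(String recordId) {
            this.recordId = recordId;
        }

        Node child(String name, Node node) {
            children.put(name, node);
            return this;
        }
    }

    /** Toy backup store that only recognizes the record ids in its own id set. */
    static final class BackupStore {
        private final Set<String> knownIds = new HashSet<>();
        private final boolean remembersOriginIds; // hypothetical switch, for contrast

        BackupStore(boolean remembersOriginIds) {
            this.remembersOriginIds = remembersOriginIds;
        }

        /** Writes a node and returns how many records had to be copied. */
        int writeNode(Node node) {
            if (knownIds.contains(node.recordId)) {
                return 0; // subtree recognized, nothing to copy
            }
            int copied = 1;
            for (Node c : node.children.values()) {
                copied += writeNode(c);
            }
            // Assumption: without the switch, the record is kept under a
            // backup-local id, so the origin id stays unknown to the backup.
            knownIds.add(remembersOriginIds ? node.recordId : "backup:" + node.recordId);
            return copied;
        }
    }

    public static void main(String[] args) {
        Node checkpoint = new Node("r1")
                .child("a", new Node("r2"))
                .child("b", new Node("r3"));

        BackupStore plain = new BackupStore(false);
        System.out.println("first run:  " + plain.writeNode(checkpoint));
        System.out.println("second run: " + plain.writeNode(checkpoint));

        BackupStore mapped = new BackupStore(true);
        System.out.println("first run (ids remembered):  " + mapped.writeNode(checkpoint));
        System.out.println("second run (ids remembered): " + mapped.writeNode(checkpoint));
    }
}
```

In this model the plain store copies all three records on both runs, while the store that remembers origin ids copies nothing on the second run. Whether the real fix should take that form is not something this sketch can answer.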



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
