jackrabbit-oak-issues mailing list archives

From Michael Dürig (JIRA) <j...@apache.org>
Subject [jira] [Commented] (OAK-5790) Chronologically rebase checkpoints on top of each other during compaction
Date Wed, 22 Mar 2017 14:22:43 GMT

    [ https://issues.apache.org/jira/browse/OAK-5790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15936387#comment-15936387 ]

Michael Dürig commented on OAK-5790:

Initial WIP implementation: https://github.com/mduerig/jackrabbit-oak/commit/5c52b9d6ab46b1eec525ed2d74b59f8eef013274

The core idea is to loop compaction through an {{ApplyDiff}}-like mechanism, similar to what
we do in {{Compactor}} for offline compaction. Unsurprisingly and somewhat unimaginatively,
this class is currently called {{Compactor2}} in my approach. {{Compactor2}} can, with the
help of a {{SegmentWriter}} instance, compact node states on top of already compacted node
states by just applying the differences. A bit of care has to be taken here not to lose the stable
id in the process. {{FileStore}} uses a {{Compactor2}} to rebase the checkpoints and the root
inside the super root separately and in chronological order on top of each other. The results
are then reassembled into a super root.
Additionally, when compaction goes through multiple cycles, each subsequent cycle rebases its
oldest checkpoint on top of the compacted root of the previous cycle.
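
To illustrate the rebasing order described above, here is a minimal, self-contained sketch. It is not the code from the linked commit and does not use any Oak API: node states are modelled as plain property maps, and the names {{RebaseSketch}}, {{rebase}}, {{compactChronologically}} and {{writes}} are all hypothetical. Stable ids and the {{SegmentWriter}} are not modelled. The sketch only shows that each state is written as a diff on top of the previously compacted state, oldest first.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical toy model; none of these names exist in Oak.
final class RebaseSketch {

    // Counts property writes and removals, as a rough stand-in for compaction cost.
    static int writes = 0;

    // Rebases 'after' on top of 'compactedBase', where 'base' is the
    // uncompacted counterpart of 'compactedBase'. Only the differences
    // between 'base' and 'after' are applied.
    static Map<String, String> rebase(Map<String, String> base,
                                      Map<String, String> after,
                                      Map<String, String> compactedBase) {
        Map<String, String> result = new TreeMap<>(compactedBase);
        for (Map.Entry<String, String> e : after.entrySet()) {
            // Added or changed property: write it.
            if (!e.getValue().equals(base.get(e.getKey()))) {
                result.put(e.getKey(), e.getValue());
                writes++;
            }
        }
        for (String key : base.keySet()) {
            // Property deleted between 'base' and 'after': remove it.
            if (!after.containsKey(key)) {
                result.remove(key);
                writes++;
            }
        }
        return result;
    }

    // Compacts the given states (checkpoints oldest first, root last),
    // each one on top of the compacted form of its predecessor.
    static List<Map<String, String>> compactChronologically(
            List<Map<String, String>> states) {
        Map<String, String> base = Map.of();
        Map<String, String> compactedBase = Map.of();
        List<Map<String, String>> compacted = new ArrayList<>();
        for (Map<String, String> state : states) {
            Map<String, String> c = rebase(base, state, compactedBase);
            compacted.add(c);
            base = state;
            compactedBase = c;
        }
        return compacted;
    }

    public static void main(String[] args) {
        List<Map<String, String>> states = List.of(
                Map.of("a", "1", "b", "2"),            // oldest checkpoint
                Map.of("a", "1", "b", "3", "c", "4"),  // newer checkpoint
                Map.of("a", "1", "c", "4"));           // root
        List<Map<String, String>> compacted = compactChronologically(states);
        System.out.println(compacted + " (" + writes + " writes)");
    }
}
```

In this toy model, a further compaction cycle would correspond to calling {{rebase}} for its oldest checkpoint with the previous cycle's root and compacted root as {{base}} and {{compactedBase}}, instead of starting from empty maps.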

[~frm], could you have a look when time permits?

> Chronologically rebase checkpoints on top of each other during compaction
> -------------------------------------------------------------------------
>                 Key: OAK-5790
>                 URL: https://issues.apache.org/jira/browse/OAK-5790
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: segment-tar
>            Reporter: Michael Dürig
>            Assignee: Michael Dürig
>              Labels: compaction, gc, performance
>             Fix For: 1.8, 1.7.1
> Currently the compactor does just a rewrite of the super root node without any special
> handling of the checkpoints. It just relies on the node de-duplication cache to avoid fully
> exploding the checkpoints.
> I think this can be improved by subsequently rebasing checkpoints on top of each other
> during compaction. (Very much like checkpoints are handled in migration).

This message was sent by Atlassian JIRA
