jackrabbit-oak-issues mailing list archives

From Michael Dürig (JIRA) <j...@apache.org>
Subject [jira] [Commented] (OAK-7672) Introduce oak-run segment-copy for moving around segments in different storages
Date Thu, 09 Aug 2018 12:21:00 GMT

    [ https://issues.apache.org/jira/browse/OAK-7672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16574758#comment-16574758 ]

Michael Dürig commented on OAK-7672:

[~dulceanu] without looking into it in too much detail, the patch looks very good to me. One thing
I'm not sure about is whether all non-segment entries in the tar files are handled properly
(e.g. binary index, graph files etc.). An interesting way to check might be to implement
a test that copies the segments from tar to Azure and back to tar and then ensures the resulting
binaries are the same (binary diff).
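
The binary diff half of such a round-trip test could be sketched roughly as follows. This is only a minimal sketch: the class name {{BinaryDirDiff}} and its {{diff}} method are hypothetical helpers, not part of the patch or of Oak, and the two copy steps (tar to Azure, Azure back to tar) are assumed to have run beforehand. It relies on {{Files.mismatch}}, which requires Java 12 or later.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper (not part of Oak): compares two directories byte for byte.
final class BinaryDirDiff {

    // Returns one entry per file that is missing on either side or whose
    // content differs; an empty list means the directories are identical.
    static List<String> diff(Path original, Path roundTripped) throws IOException {
        List<String> left = listFiles(original);
        List<String> right = listFiles(roundTripped);
        List<String> mismatches = new ArrayList<>();
        for (String name : left) {
            if (!right.contains(name)) {
                mismatches.add(name + " (missing in copy)");
            } else if (Files.mismatch(original.resolve(name), roundTripped.resolve(name)) != -1L) {
                // Files.mismatch returns -1 only when both files have identical bytes.
                mismatches.add(name + " (content differs)");
            }
        }
        for (String name : right) {
            if (!left.contains(name)) {
                mismatches.add(name + " (only in copy)");
            }
        }
        return mismatches;
    }

    // Lists all regular files below dir as sorted, relative path strings.
    private static List<String> listFiles(Path dir) throws IOException {
        try (Stream<Path> paths = Files.walk(dir)) {
            return paths.filter(Files::isRegularFile)
                        .map(p -> dir.relativize(p).toString())
                        .sorted()
                        .collect(Collectors.toList());
        }
    }
}
```

A test would then assert that {{BinaryDirDiff.diff(sourceDir, roundTrippedDir)}} is empty. Comparing whole tar files is the strictest check, since any difference in the non-segment entries would show up as a content mismatch.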

Regarding the documentation:
 * the new segment-copy entry is missing from the table of contents.
 * I prefer the following wording (bold): {{includes __all previous *revisions* persisted
in the Segment Store__ and therefore *retaining the entire history*.}}

> Introduce oak-run segment-copy for moving around segments in different storages
> -------------------------------------------------------------------------------
>                 Key: OAK-7672
>                 URL: https://issues.apache.org/jira/browse/OAK-7672
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: oak-run, segment-tar
>            Reporter: Andrei Dulceanu
>            Assignee: Andrei Dulceanu
>            Priority: Major
>              Labels: tooling
>             Fix For: 1.10, 1.9.7
>         Attachments: OAK-7672.patch
> Often there's a need to transform one type of {{SegmentStore}} (e.g. local TarMK) into
*the exact same* counterpart, using another persistence type (e.g. Azure Segment Store).
While {{oak-upgrade}} partially solves this through sidegrades (see OAK-7623), there's a gap
in the final content because of the level at which {{oak-upgrade}} operates (node store level).
Therefore, the resulting sidegraded repository doesn't contain all the (possibly stale, unreferenced)
data from the original repository, but only the latest head state. A side effect of this is
that the resulting repository is always compacted.
> Introducing a new command in {{oak-run}}, namely {{segment-copy}}, would allow us to
operate at a lower level (i.e. segment persistence), dealing only with constructs from {{org.apache.jackrabbit.oak.segment.spi.persistence}}:
journal file, archives and archive entries. This way the only focus of this process would
be to "translate" a segment between two persistence formats, without caring about the node
logic stored inside (referenced/unreferenced node/property).

This message was sent by Atlassian JIRA
