jackrabbit-oak-issues mailing list archives

From "Francesco Mari (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (OAK-4740) TarReader recovery skips generating the index and binary graphs
Date Thu, 01 Sep 2016 15:21:20 GMT

    [ https://issues.apache.org/jira/browse/OAK-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15455750#comment-15455750 ]

Francesco Mari commented on OAK-4740:

To solve this issue, I need to implement a {{TarRecovery}} strategy that parses the raw segment
data to extract the references to other segments and the references to external binaries.

The references to other segments are easy to retrieve, since they are already stored in the
segment and easily accessible via the {{Segment}} API.

The references to external binaries are not as accessible, because those references were removed
for good when the unified index of references to external binaries was introduced in the TAR
file. To retrieve those references, I need to parse the records stored in the segment and
identify all of those records that actually represent references to external binaries. This
might not be an easy task for a number of reasons. First, the lack of typing information is
a big issue when it comes to parsing the data in the segment without resorting to further context
(see OAK-2498). Second, a reference to an external binary might be too long to be stored in
a data segment and would have to be read from a bulk segment, further complicating the implementation
of the {{TarRecovery}}. One could ask why a binary ID should ever be bigger than ~16K, but this
doesn't seem to be the right place to rant about this.
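
As a rough illustration of the bookkeeping such a strategy would need, here is a minimal sketch.
It is not Oak code: {{RecoveryIndexBuilder}} and {{onSegment}} are hypothetical names, and the
hard part discussed above (actually extracting the references from the raw segment data) is
assumed to have already happened before {{onSegment}} is called.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

// Hypothetical sketch, not part of the Oak API: collects, per recovered segment,
// the two kinds of references needed to rebuild the graph and the binary
// references index of a TAR file.
class RecoveryIndexBuilder {

    // Segment graph: segment ID -> IDs of the segments it references.
    private final Map<UUID, Set<UUID>> graph = new LinkedHashMap<>();

    // Binary references: segment ID -> external binary IDs found in that segment.
    private final Map<UUID, Set<String>> binaryReferences = new LinkedHashMap<>();

    // Assumed to be called once per recovered segment, with references that
    // were already extracted from the raw segment data by the caller.
    void onSegment(UUID id, List<UUID> segmentRefs, List<String> binaryIds) {
        if (!segmentRefs.isEmpty()) {
            graph.put(id, new LinkedHashSet<>(segmentRefs));
        }
        if (!binaryIds.isEmpty()) {
            binaryReferences.put(id, new LinkedHashSet<>(binaryIds));
        }
    }

    Map<UUID, Set<UUID>> getGraph() {
        return graph;
    }

    Map<UUID, Set<String>> getBinaryReferences() {
        return binaryReferences;
    }
}
```

Segments without outgoing references are simply left out of both maps, which keeps the
rebuilt entries small; whether the real format wants empty entries is not settled here.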

I scheduled this issue a couple of versions ahead to have enough time to tackle OAK-2498 and
OAK-4659, whose combination might be beneficial to the resolution of this issue.

> TarReader recovery skips generating the index and binary graphs
> ---------------------------------------------------------------
>                 Key: OAK-4740
>                 URL: https://issues.apache.org/jira/browse/OAK-4740
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: segment-tar
>            Reporter: Alex Parvulescu
>            Assignee: Francesco Mari
>             Fix For: Segment Tar 0.0.16
> As noticed from the tar recovery bits [0], the resulting tar file would lack the binary
> reference graph and index graph. This has implications on the DSGC (not properly reporting
> binary references would result in binaries being GC'ed) and GC operations.
> / cc [~frm], [~mduerig]
> [0] https://github.com/apache/jackrabbit-oak/blob/trunk/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/file/TarReader.java#L216
