jackrabbit-oak-issues mailing list archives

From Michael Dürig (JIRA) <j...@apache.org>
Subject [jira] [Commented] (OAK-4201) Add an index of binary references in a tar file
Date Mon, 18 Jul 2016 12:44:20 GMT

    [ https://issues.apache.org/jira/browse/OAK-4201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15382222#comment-15382222 ]

Michael Dürig commented on OAK-4201:

Nice! Thanks for taking care of this. Will be interesting to see how this affects DSGC performance.

Looking at the commit, I think {{TarReader#getGraphEntrySize}} and {{TarReader#getBinaryReferences}}
should write a log message when they encounter an {{IOException}} instead of silently ignoring it.
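
For illustration, the suggestion could look roughly like the sketch below. It is not the code from the commit: the class, the {{EntryLoader}} interface, the {{loadOrNull}} method and the file names are all hypothetical, and {{java.util.logging}} is used only to keep the example self-contained.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.logging.Level;
import java.util.logging.Logger;

final class LoggingOnIoFailureSketch {

    private static final Logger LOG =
            Logger.getLogger(LoggingOnIoFailureSketch.class.getName());

    /** Hypothetical stand-in for the code that reads an entry from a tar file. */
    interface EntryLoader {
        ByteBuffer load() throws IOException;
    }

    /**
     * Returns the loaded entry, or null if reading fails. The failure is
     * logged with the file name and the exception instead of being swallowed.
     */
    static ByteBuffer loadOrNull(String fileName, EntryLoader loader) {
        try {
            return loader.load();
        } catch (IOException e) {
            LOG.log(Level.WARNING, "Unable to read entry from " + fileName, e);
            return null;
        }
    }

    public static void main(String[] args) {
        // Hypothetical file names, used only for the log message.
        ByteBuffer ok = loadOrNull("first.tar", () -> ByteBuffer.allocate(4));
        ByteBuffer failed = loadOrNull("second.tar", () -> {
            throw new IOException("simulated read failure");
        });
        System.out.println(ok.capacity());
        System.out.println(failed == null);
    }
}
```

The caller still gets a fallback value, so behaviour is unchanged apart from the warning, which makes a corrupt or unreadable entry visible in the logs.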

> Add an index of binary references in a tar file
> -----------------------------------------------
>                 Key: OAK-4201
>                 URL: https://issues.apache.org/jira/browse/OAK-4201
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: segment-tar
>            Reporter: Chetan Mehrotra
>            Assignee: Francesco Mari
>             Fix For: Segment Tar 0.0.4
>         Attachments: OAK-4201-01.patch
> Currently, for Blob GC in the segment case, {{SegmentBlobReferenceRetriever}} goes through
> all tar files and extracts the binary references. This has 2 issues:
> # The logic has to go through all the segments in all tar files
> # All segments get loaded in memory once, which would affect normal system performance
> This process can be optimized if we also write a file entry in the tar file (similar to gph, i.e.
> graph, and idx, i.e. index, files) which has entries for all binary references referred to in
> any segment present in that tar file. Then the GC logic would just have to read this file and avoid
> scanning all the segments.
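
The idea in the description can be sketched as follows. This is only an illustration under assumptions: the layout (an int count followed by length-prefixed UTF-8 strings) and all names are hypothetical and are not the format used by the attached patch or by Oak.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashSet;
import java.util.Set;

final class BinaryReferencesIndexSketch {

    /** Encodes references as: int count, then (int length, UTF-8 bytes) each. */
    static byte[] write(Set<String> references) {
        int size = Integer.BYTES;
        for (String reference : references) {
            size += Integer.BYTES + reference.getBytes(StandardCharsets.UTF_8).length;
        }
        ByteBuffer buffer = ByteBuffer.allocate(size);
        buffer.putInt(references.size());
        for (String reference : references) {
            byte[] bytes = reference.getBytes(StandardCharsets.UTF_8);
            buffer.putInt(bytes.length);
            buffer.put(bytes);
        }
        return buffer.array();
    }

    /** Decodes an entry produced by write() without touching any segment. */
    static Set<String> read(byte[] entry) {
        ByteBuffer buffer = ByteBuffer.wrap(entry);
        int count = buffer.getInt();
        Set<String> references = new LinkedHashSet<>();
        for (int i = 0; i < count; i++) {
            byte[] bytes = new byte[buffer.getInt()];
            buffer.get(bytes);
            references.add(new String(bytes, StandardCharsets.UTF_8));
        }
        return references;
    }

    public static void main(String[] args) {
        Set<String> references = new LinkedHashSet<>();
        references.add("blob-id-1"); // hypothetical blob identifiers
        references.add("blob-id-2");
        System.out.println(read(write(references)).equals(references));
    }
}
```

With such an entry written once per tar file, a GC run would only need to read one small entry per file rather than load every segment.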

This message was sent by Atlassian JIRA