jackrabbit-oak-issues mailing list archives

From Michael Dürig (JIRA) <j...@apache.org>
Subject [jira] [Updated] (OAK-3177) Compaction slow on repository with continuous writes
Date Thu, 06 Aug 2015 12:40:04 GMT

     [ https://issues.apache.org/jira/browse/OAK-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Dürig updated OAK-3177:
    Attachment: OAK-3177.png


Attaching a graph showing average compaction times with and without the patch for 7 subsequent
compaction cycles on a repository with 5 concurrent writer threads ({{SegmentCompactionIT}}).
The graphs show the times of the individual compaction cycles normalised against the first

> Compaction slow on repository with continuous writes
> ----------------------------------------------------
>                 Key: OAK-3177
>                 URL: https://issues.apache.org/jira/browse/OAK-3177
>             Project: Jackrabbit Oak
>          Issue Type: Sub-task
>          Components: segmentmk
>            Reporter: Michael Dürig
>            Assignee: Michael Dürig
>              Labels: compaction, gc
>             Fix For: 1.3.5
>         Attachments: OAK-3177.patch, OAK-3177.png
> OAK-2734 introduced retry cycles and the option to force compaction when all cycles fail.
However, OAK-2192 introduced a performance regression: each compaction cycle takes time on the
order of the size of the repository instead of on the order of the number of remaining
changes to compact. This is caused by comparing compacted with pre-compacted node states,
which is necessary to avoid mixed segments (aka OAK-2192). To fix the performance regression,
I propose passing the compactor an additional node state (the 'onto' state). The diff would
then be calculated across the pre-compacted states, which performs on the order of the number
of changes. The changes would then be applied to the 'onto' state, which is a compacted state,
to avoid mixed segments.
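
The data flow of the proposal can be sketched roughly as follows. This is a toy model only, not the Oak API and not the attached patch: a "node state" is assumed to be a flat map from path to value, and the class name OntoCompactionSketch and the method compactOnto are hypothetical. The flat-map diff here scans every entry, so it illustrates only which states are diffed and where the changes land, not the cost characteristics described above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Toy illustration (hypothetical names, not Oak's NodeState/Compactor API).
// A "state" is modelled as a flat map from path to value.
final class OntoCompactionSketch {

    private OntoCompactionSketch() {
    }

    /**
     * Diffs two pre-compacted states (before -> after) and applies the
     * resulting changes to a copy of the already compacted 'onto' state.
     */
    static Map<String, String> compactOnto(Map<String, String> before,
                                           Map<String, String> after,
                                           Map<String, String> onto) {
        // Start from the compacted state so the result stays "compacted".
        Map<String, String> result = new HashMap<>(onto);

        // Paths present in 'before' but missing in 'after' were removed.
        for (String path : before.keySet()) {
            if (!after.containsKey(path)) {
                result.remove(path);
            }
        }

        // Paths that are new or whose value differs were added or changed.
        for (Map.Entry<String, String> entry : after.entrySet()) {
            String path = entry.getKey();
            boolean added = !before.containsKey(path);
            boolean changed = !Objects.equals(before.get(path), entry.getValue());
            if (added || changed) {
                result.put(path, entry.getValue());
            }
        }
        return result;
    }
}
```

In this sketch, 'before' stands for the pre-compacted state the previous cycle started from, 'after' for the pre-compacted state including the concurrent writes, and 'onto' for the compacted result of the previous cycle.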

This message was sent by Atlassian JIRA
