jackrabbit-oak-issues mailing list archives

From "Alex Parvulescu (JIRA)" <j...@apache.org>
Subject [jira] [Created] (OAK-3362) Estimate compaction based on diff to previous compacted head state
Date Mon, 07 Sep 2015 12:20:46 GMT
Alex Parvulescu created OAK-3362:

             Summary: Estimate compaction based on diff to previous compacted head state
                 Key: OAK-3362
                 URL: https://issues.apache.org/jira/browse/OAK-3362
             Project: Jackrabbit Oak
          Issue Type: Sub-task
          Components: segmentmk
            Reporter: Alex Parvulescu
            Priority: Minor

Food for thought: try to base the compaction estimation on a diff between the latest compacted
state and the current state.

* estimation duration would be proportional to the number of changes in the current head state
* using the size on disk as a reference, the estimation could stop early as soon as the
accumulated diff goes over the gc threshold
* data collected during this diff could in theory be passed as input to the compactor, so it
could focus on compacting a specific subtree

* need to keep a reference to the previous compacted state. After startup and before the first
compaction this might prove difficult (except maybe if only the revision is persisted, similar
to what the async indexer currently does)
* coming up with a threshold for running compaction might prove difficult
* the diff might be costly, but still cheaper than the current full diff
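
To make the idea above more concrete, here is a rough sketch of a diff-based estimation with an
early stop. It does not use any Oak API: Node, Estimate, DiffEstimator and the parameter
gcThresholdPercent are hypothetical names made up for illustration, and counting the byte length
of changed, added or removed values is only an assumed stand-in for the real garbage estimate.
Subtrees shared by reference between the two states are skipped, so the work depends on the
number of changes rather than on the repository size.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

public class DiffEstimator {

    /** Hypothetical stand-in for a repository node: a value plus named children. */
    static final class Node {
        final String value;
        final Map<String, Node> children = new TreeMap<>();

        Node(String value) { this.value = value; }

        Node with(String name, Node child) { children.put(name, child); return this; }
    }

    /** Data collected by the diff; changedPaths could later be handed to a compactor. */
    static final class Estimate {
        long changedBytes;
        boolean stoppedEarly;
        final List<String> changedPaths = new ArrayList<>();
    }

    /** Diffs the last compacted state against the head, stopping once the limit is exceeded. */
    static Estimate estimate(Node compacted, Node head, long sizeOnDisk, int gcThresholdPercent) {
        Estimate e = new Estimate();
        // Assumption: the gc threshold is a percentage of the size on disk.
        long limit = sizeOnDisk * gcThresholdPercent / 100;
        diff("/", compacted, head, limit, e);
        return e;
    }

    private static void diff(String path, Node before, Node after, long limit, Estimate e) {
        // Same reference means an untouched subtree: skip it without traversing.
        if (e.stoppedEarly || before == after) {
            return;
        }
        String oldValue = before == null ? null : before.value;
        String newValue = after == null ? null : after.value;
        if (oldValue == null || !oldValue.equals(newValue)) {
            // Count the new value for added/changed nodes, the old one for removed nodes.
            e.changedBytes += (newValue != null ? newValue : oldValue).length();
            e.changedPaths.add(path);
            if (e.changedBytes > limit) {
                e.stoppedEarly = true; // over the threshold: no need to look further
                return;
            }
        }
        Map<String, Node> oldKids =
                before == null ? Collections.<String, Node>emptyMap() : before.children;
        Map<String, Node> newKids =
                after == null ? Collections.<String, Node>emptyMap() : after.children;
        TreeSet<String> names = new TreeSet<>(oldKids.keySet());
        names.addAll(newKids.keySet());
        for (String name : names) {
            String childPath = path.endsWith("/") ? path + name : path + "/" + name;
            diff(childPath, oldKids.get(name), newKids.get(name), limit, e);
        }
    }
}
```

Calling estimate(compacted, head, sizeOnDisk, gcThresholdPercent) returns the accumulated byte
count, the paths that changed, and a flag telling whether the run was cut short. A stoppedEarly
result would mean "compaction is worth running" without finishing the diff.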
