jackrabbit-oak-issues mailing list archives

From Michael Dürig (JIRA) <j...@apache.org>
Subject [jira] [Commented] (OAK-4293) Refactor / rework compaction gain estimation
Date Mon, 08 Aug 2016 15:28:20 GMT

    [ https://issues.apache.org/jira/browse/OAK-4293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411949#comment-15411949 ]

Michael Dürig commented on OAK-4293:

Nice! I like the {{GCEstimation}} abstraction, which allows for future evolution. My main
concern is the dependency of {{SizeDeltaGcEstimation}} on {{FileStoreStats}} (via {{FileStoreStats#getPreviousCleanupSize}}).
I would prefer it the other way around: {{SizeDeltaGcEstimation}} should depend on {{GCJournalWriter}}
directly. IMO {{FileStoreStats}} should be "monitoring only".
A minor point is naming: I would prefer {{GCJournal}} to {{GCJournalWriter}}, as it actually
also covers the reading part.
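
The dependency direction suggested above could look roughly like the sketch below. It is only an illustration and not the code from size-estimation.patch: the method names ({{persist}}, {{readLastSize}}), the {{GCEstimationResult}} type, the threshold parameter and the first-run behaviour are all assumptions made for the example. The point is that the estimation reads the previous cleanup size from the journal itself, so no monitoring class is involved.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical result type: whether GC looks worthwhile, plus a reason. */
final class GCEstimationResult {
    final boolean gcNeeded;
    final String reason;

    GCEstimationResult(boolean gcNeeded, String reason) {
        this.gcNeeded = gcNeeded;
        this.reason = reason;
    }
}

/** The abstraction: any strategy that can decide whether GC should run. */
interface GCEstimation {
    GCEstimationResult estimate();
}

/**
 * Hypothetical journal that covers both writing and reading,
 * hence GCJournal rather than GCJournalWriter. Kept in memory here;
 * a real one would persist entries to disk.
 */
final class GCJournal {
    private final List<Long> sizesAfterCleanup = new ArrayList<>();

    /** Record the repository size measured after a cleanup. */
    void persist(long repoSizeAfterCleanup) {
        sizesAfterCleanup.add(repoSizeAfterCleanup);
    }

    /** Size recorded by the most recent cleanup, or -1 if there is none. */
    long readLastSize() {
        return sizesAfterCleanup.isEmpty()
                ? -1
                : sizesAfterCleanup.get(sizesAfterCleanup.size() - 1);
    }
}

/** Depends on GCJournal directly; no monitoring/statistics class involved. */
final class SizeDeltaGcEstimation implements GCEstimation {
    private final long deltaThreshold; // assumed: minimum growth in bytes
    private final GCJournal journal;
    private final long currentSize;

    SizeDeltaGcEstimation(long deltaThreshold, GCJournal journal, long currentSize) {
        this.deltaThreshold = deltaThreshold;
        this.journal = journal;
        this.currentSize = currentSize;
    }

    @Override
    public GCEstimationResult estimate() {
        long previous = journal.readLastSize();
        if (previous < 0) {
            // Assumption: without a previous entry, err on the side of running GC.
            return new GCEstimationResult(true, "no previous cleanup recorded");
        }
        long delta = currentSize - previous;
        boolean needed = delta > deltaThreshold;
        return new GCEstimationResult(needed,
                "size delta " + delta + " vs threshold " + deltaThreshold);
    }
}

final class GcEstimationDemo {
    public static void main(String[] args) {
        GCJournal journal = new GCJournal();
        journal.persist(500L);
        GCEstimation estimation = new SizeDeltaGcEstimation(100L, journal, 700L);
        GCEstimationResult result = estimation.estimate();
        System.out.println(result.gcNeeded + ": " + result.reason);
    }
}
```

Such an estimate only compares two numbers, so it stays far cheaper than compaction itself, and {{FileStoreStats}} can remain "monitoring only".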

> Refactor / rework compaction gain estimation 
> ---------------------------------------------
>                 Key: OAK-4293
>                 URL: https://issues.apache.org/jira/browse/OAK-4293
>             Project: Jackrabbit Oak
>          Issue Type: Task
>          Components: segment-tar
>            Reporter: Michael Dürig
>            Assignee: Alex Parvulescu
>              Labels: gc
>             Fix For: Segment Tar 0.0.10
>         Attachments: size-estimation.patch
> I think we have to take another look at {{CompactionGainEstimate}} and see whether we
> can come up with a more efficient way to estimate the compaction gain. The current implementation
> is expensive wrt. IO, CPU and cache coherence. If we want to keep an estimation step we need
> IMO to come up with a cheap way (at least 2 orders of magnitude cheaper than compaction). Otherwise
> I would actually propose to remove the current estimation approach entirely.
