jackrabbit-oak-issues mailing list archives

From "Marcel Reutegger (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (OAK-1926) UnmergedBranch state growing with empty BranchCommit leading to performance degradation
Date Wed, 23 Jul 2014 06:40:38 GMT

    [ https://issues.apache.org/jira/browse/OAK-1926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071420#comment-14071420 ]

Marcel Reutegger commented on OAK-1926:
---------------------------------------

bq. only be removed if we do a garbage collection and remove all commits which were part of
those branches

Wouldn't it be sufficient to just remove the _revisions entries from the root document on
startup? Readers will then see commits from branches that were never merged as non-committed
and ignore them.
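
To make the idea concrete, here is a minimal sketch in plain Java, not actual Oak code. It models the _revisions entries of the root document as a simple map from revision string to commit value. The class and method names are hypothetical, and two things are assumptions: that a commit value starting with "c" marks a committed revision, and that the cluster id is the last dash-separated token of the revision string (as the dump in the issue suggests).

```java
import java.util.Iterator;
import java.util.Map;

// Hypothetical helper, not part of Oak.
class StaleRevisionCleanup {

    // Assumption: a commit value starting with "c" marks a committed revision.
    static boolean isCommitted(String commitValue) {
        return commitValue != null && commitValue.startsWith("c");
    }

    // Assumption: revisions look like "r146d2edb7a7-0-1", cluster id last.
    static int clusterIdOf(String revision) {
        return Integer.parseInt(revision.substring(revision.lastIndexOf('-') + 1));
    }

    // Removes all non-committed entries written by the given cluster node
    // and returns how many entries were removed.
    static int removeUnmerged(Map<String, String> revisions, int clusterId) {
        int removed = 0;
        Iterator<Map.Entry<String, String>> it = revisions.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, String> entry = it.next();
            if (clusterIdOf(entry.getKey()) == clusterId
                    && !isCommitted(entry.getValue())) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }
}
```

Restricting the cleanup to the local cluster id is a design choice of the sketch: on startup a node can only be sure that its own unmerged branches are dead, not those of other cluster nodes.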

> UnmergedBranch state growing with empty BranchCommit leading to performance degradation
> ---------------------------------------------------------------------------------------
>
>                 Key: OAK-1926
>                 URL: https://issues.apache.org/jira/browse/OAK-1926
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: mongomk
>    Affects Versions: 1.0.1
>            Reporter: Chetan Mehrotra
>            Assignee: Chetan Mehrotra
>             Fix For: 1.1
>
>
> In some cluster deployments, the in-memory state of UnmergedBranches has been seen to contain
a large number of empty commits. For example, in one of the runs there were 750 entries
in UnmergedBranches, and each Branch had only empty branch commits.
> If there is a large number of UnmergedBranches, read performance degrades, because the logic
that determines revision validity currently scans all branches.
> Below is part of the UnmergedBranch state:
> {noformat}
> Branch 1
> 1 -> br146d2edb7a7-0-1 (true) (revision: "br146d2edb7a7-0-1", clusterId: 1, time:
"2014-06-25 05:08:52.903", branch: true)
> 2 -> br146d2f0450b-0-1 (true) (revision: "br146d2f0450b-0-1", clusterId: 1, time:
"2014-06-25 05:11:40.171", branch: true)
> Branch 2
> 1 -> br146d2ef1d08-0-1 (true) (revision: "br146d2ef1d08-0-1", clusterId: 1, time:
"2014-06-25 05:10:24.392", branch: true)
> Branch 3
> 1 -> br146d2ed26ca-0-1 (true) (revision: "br146d2ed26ca-0-1", clusterId: 1, time:
"2014-06-25 05:08:15.818", branch: true)
> 2 -> br146d2edfd0e-0-1 (true) (revision: "br146d2edfd0e-0-1", clusterId: 1, time:
"2014-06-25 05:09:10.670", branch: true)
> Branch 4
> 1 -> br146d2ecd85b-0-1 (true) (revision: "br146d2ecd85b-0-1", clusterId: 1, time:
"2014-06-25 05:07:55.739", branch: true)
> Branch 5
> 1 -> br146d2ec21a0-0-1 (true) (revision: "br146d2ec21a0-0-1", clusterId: 1, time:
"2014-06-25 05:07:08.960", branch: true)
> 2 -> br146d2ec8eca-0-1 (true) (revision: "br146d2ec8eca-0-1", clusterId: 1, time:
"2014-06-25 05:07:36.906", branch: true)
> Branch 6
> 1 -> br146d2eaf159-1-1 (true) (revision: "br146d2eaf159-1-1", clusterId: 1, time:
"2014-06-25 05:05:51.065", counter: 1, branch: true)
> Branch 7
> 1 -> br146d2e9a513-0-1 (true) (revision: "br146d2e9a513-0-1", clusterId: 1, time:
"2014-06-25 05:04:26.003", branch: true)
> {noformat}
> [~mreutegg] suggested that these branches might belong to revisions which resulted
in a collision, and on checking that indeed appears to be the case (the value true in brackets
above indicates that). Further, given the age of these revisions, it looks like they get populated
on startup itself.
> *Fix*
> * Need to check why we need to populate the UnmergedBranch state
> * Possibly implement some purge job which would remove such stale entries
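
A purge job along the lines of the second bullet could look roughly like the following sketch. It is plain Java rather than Oak code: the classes `Branch` and `BranchCommit` here are simplified, hypothetical stand-ins for the in-memory state, and the `empty` flag and the cutoff time are assumptions made for illustration only.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the in-memory state, not the Oak classes.
class BranchPurgeSketch {

    static final class BranchCommit {
        final String revision;
        final long timeMillis;
        final boolean empty; // assumption: true if the commit carries no changes

        BranchCommit(String revision, long timeMillis, boolean empty) {
            this.revision = revision;
            this.timeMillis = timeMillis;
            this.empty = empty;
        }
    }

    static final class Branch {
        final List<BranchCommit> commits;

        Branch(BranchCommit... commits) {
            this.commits = Arrays.asList(commits);
        }

        // Stale: every commit is empty and older than the cutoff.
        // A branch without any commits also counts as stale.
        boolean isStale(long cutoffMillis) {
            for (BranchCommit c : commits) {
                if (!c.empty || c.timeMillis >= cutoffMillis) {
                    return false;
                }
            }
            return true;
        }
    }

    // Removes stale branches in place and returns how many were purged.
    static int purge(List<Branch> branches, long cutoffMillis) {
        int before = branches.size();
        branches.removeIf(b -> b.isStale(cutoffMillis));
        return before - branches.size();
    }
}
```

With fewer entries left in the list, a revision validity check that scans all branches has correspondingly less work to do.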



--
This message was sent by Atlassian JIRA
(v6.2#6252)
