jackrabbit-oak-issues mailing list archives

From Michael Dürig (JIRA) <j...@apache.org>
Subject [jira] [Assigned] (OAK-5464) Improve the transaction rate of the TarMK
Date Wed, 06 Dec 2017 10:25:00 GMT

     [ https://issues.apache.org/jira/browse/OAK-5464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Michael Dürig reassigned OAK-5464:

    Assignee: Michael Dürig

> Improve the transaction rate of the TarMK
> -----------------------------------------
>                 Key: OAK-5464
>                 URL: https://issues.apache.org/jira/browse/OAK-5464
>             Project: Jackrabbit Oak
>          Issue Type: Epic
>          Components: segment-tar
>            Reporter: Michael Dürig
>            Assignee: Michael Dürig
>              Labels: scalability
>             Fix For: 1.8
> The TarMK's write throughput is limited by the way concurrent commits are processed:
rebasing and running the commit hooks happen within a lock without any explicit scheduling.
This epic covers improving the overall transaction rate. The proposed approach is roughly
to first make the scheduling of transactions explicit, then add monitoring of transactions to
gain a better understanding, and then experiment with and implement explicit scheduling
strategies to optimise particular aspects.
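As a rough illustration of what "making scheduling explicit" could mean, the sketch below queues commits on a single worker thread instead of letting them compete for a lock, and records how long each commit waited. It is not Oak code: `ExplicitCommitScheduler`, its methods, and the use of a `String` as a stand-in for the repository head are all hypothetical names chosen for the example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.UnaryOperator;

/** Hypothetical sketch, not Oak API: commits are queued and applied by one thread. */
final class ExplicitCommitScheduler implements AutoCloseable {
    // A single daemon worker replaces the implicit "whoever gets the lock" ordering.
    private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "commit-scheduler");
        t.setDaemon(true);
        return t;
    });
    private final AtomicLong commitCount = new AtomicLong();
    private final AtomicLong totalQueueNanos = new AtomicLong();
    private String head; // stand-in for the repository head; only touched by the worker

    ExplicitCommitScheduler(String initialHead) {
        this.head = initialHead;
    }

    /** Queues a commit; the function plays the role of rebase plus commit hooks. */
    CompletableFuture<String> schedule(UnaryOperator<String> commit) {
        long enqueued = System.nanoTime();
        return CompletableFuture.supplyAsync(() -> {
            totalQueueNanos.addAndGet(System.nanoTime() - enqueued); // time in queue
            head = commit.apply(head);
            commitCount.incrementAndGet();
            return head;
        }, worker);
    }

    long commitCount() { return commitCount.get(); }

    long totalQueueNanos() { return totalQueueNanos.get(); }

    @Override
    public void close() { worker.shutdown(); }
}
```

Because the queue is an explicit object, statistics such as commit count and time in queue fall out almost for free, and the queue's ordering policy can later be swapped without touching the callers.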
> h2. Summary of ideas mentioned in an offline session
> h3. Advantages of explicit scheduling:
> * Control over the order of commits
> * Sophisticated monitoring (commit statistics such as commit rate and time in queue)
> * Favour certain commits (e.g. checkpoints)
> * Reorder commits to simplify rebasing
> * Suspend the compactor on concurrent commits and have it resume where it left off afterwards
> * Parallelise certain commits (e.g. by piggy backing)
> * Implement a concurrent commit editor. We'd need to take care of proper access to the
shared state; [~frm] maybe introduce the idea of a common context to enforce concurrent access
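One way to picture "favouring certain commits" is a priority-ordered queue in which checkpoints overtake regular commits while submission order is kept within each kind. The following is only a sketch under that assumption; `PrioritisedCommitQueue`, `Kind`, and `PendingCommit` are hypothetical names and do not exist in Oak.

```java
import java.util.Comparator;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch, not Oak API: a commit queue that favours checkpoints. */
final class PrioritisedCommitQueue {
    /** Declaration order doubles as priority: earlier constants are served first. */
    enum Kind { CHECKPOINT, REGULAR }

    static final class PendingCommit {
        final Kind kind;
        final String name;
        final long sequence; // preserves submission order within one kind

        PendingCommit(Kind kind, String name, long sequence) {
            this.kind = kind;
            this.name = name;
            this.sequence = sequence;
        }
    }

    private final AtomicLong sequence = new AtomicLong();
    private final PriorityBlockingQueue<PendingCommit> queue = new PriorityBlockingQueue<>(
            11,
            Comparator.comparing((PendingCommit c) -> c.kind)
                      .thenComparingLong(c -> c.sequence));

    void submit(Kind kind, String name) {
        queue.add(new PendingCommit(kind, name, sequence.getAndIncrement()));
    }

    /** Returns the name of the next commit to process, or null if none is waiting. */
    String pollName() {
        PendingCommit next = queue.poll();
        return next == null ? null : next.name;
    }
}
```

Submitting two regular commits followed by a checkpoint and then polling yields the checkpoint first and the regular commits afterwards in their original order. Reordering to simplify rebasing would need a different comparator, which is outside the scope of this sketch.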
> h3. Scheduler Implementation
> * Expedite
> * Prioritise
> * Defer
> * Collapse
> * Coalesce
> * Parallelise
> * Piggy back: can we piggy back commits on top of each other? The idea would be, while
processing the changes of one commit, to also check them for conflicts with the changes of
other commits waiting to commit. If a conflict is detected, that other commit can be failed
immediately (provided the current commit doesn't fail).
> * Merging non-conflicting commits: given multiple transactions ready to commit at the
same time, can we process them as one (provided they don't conflict) instead of one after
the other, which requires the later transaction to be rebased on the former?
> * Shield the file store from {{InterruptedException}} because of thread boundaries introduced
> * Implement tests, benchmarks and fixtures for verification
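The piggy-back and merging ideas above both hinge on detecting conflicts between commits that are waiting at the same time. The sketch below shows one simplified way to do that, assuming a commit can be modelled purely as the set of paths it changes and that two commits conflict exactly when those sets overlap. Real conflict rules in Oak are more nuanced, and `CommitBatcher` and `Batch` are hypothetical names. Conflicting commits are deferred to a later round here rather than failed outright.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch, not Oak API: group waiting commits that touch disjoint paths. */
final class CommitBatcher {
    static final class Batch {
        /** Commits that can be processed together as one. */
        final List<Set<String>> merged = new ArrayList<>();
        /** Commits that overlap with the batch and are left for a later round. */
        final List<Set<String>> deferred = new ArrayList<>();
    }

    /** Greedy pass in queue order; each commit is modelled as its set of changed paths. */
    static Batch batch(List<Set<String>> waiting) {
        Batch result = new Batch();
        Set<String> touched = new HashSet<>(); // paths changed by the batch so far
        for (Set<String> commit : waiting) {
            if (Collections.disjoint(touched, commit)) {
                touched.addAll(commit);
                result.merged.add(commit);
            } else {
                result.deferred.add(commit);
            }
        }
        return result;
    }
}
```

With three waiting commits changing `/content/a`, `/content/b`, and both `/content/a` and `/content/c`, the first two end up in `merged` and the third in `deferred`, since it overlaps with the first. A piggy-back variant would fail the third commit immediately instead of deferring it.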

This message was sent by Atlassian JIRA