jackrabbit-oak-issues mailing list archives

From Michael Dürig (JIRA) <j...@apache.org>
Subject [jira] [Commented] (OAK-7854) Add liveliness monitoring for FileStore background operations
Date Tue, 23 Oct 2018 12:18:00 GMT

    [ https://issues.apache.org/jira/browse/OAK-7854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16660548#comment-16660548 ]

Michael Dürig commented on OAK-7854:
------------------------------------

Third variant: https://github.com/mduerig/jackrabbit-oak/commit/d33022c38ccdfa90a93b84642f907d4255aadfa2,
this time using a timer, which allows for monitoring the flush rate along with statistics
about the flush duration.
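
To illustrate the idea of timing each flush, here is a minimal sketch using only the JDK. It is not the code from the linked commit: the class name FlushTimer and all of its methods are hypothetical, and a real implementation would more likely go through a metrics library. The count gives the flush rate when sampled over time, and the recorded durations give the mean and maximum.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper (not part of Oak): counts flushes and tracks their duration. */
final class FlushTimer {
    private final AtomicLong count = new AtomicLong();
    private final AtomicLong totalNanos = new AtomicLong();
    private final AtomicLong maxNanos = new AtomicLong();

    /** Runs the flush operation and records its duration, even if it throws. */
    void time(Runnable flush) {
        long start = System.nanoTime();
        try {
            flush.run();
        } finally {
            record(System.nanoTime() - start);
        }
    }

    /** Records one completed flush that took the given number of nanoseconds. */
    void record(long nanos) {
        count.incrementAndGet();
        totalNanos.addAndGet(nanos);
        maxNanos.accumulateAndGet(nanos, Math::max);
    }

    /** Number of flushes so far; sampling this periodically yields the flush rate. */
    long count() {
        return count.get();
    }

    /** Longest flush observed so far, in nanoseconds. */
    long maxNanos() {
        return maxNanos.get();
    }

    /** Mean flush duration in milliseconds, or 0 if nothing was recorded yet. */
    double meanMillis() {
        long n = count.get();
        return n == 0 ? 0.0 : totalNanos.get() / (n * 1_000_000.0);
    }
}
```

A caller would wrap the actual flush, for example `timer.time(() -> store.flush())`, where `store` stands for whatever object performs the flush. A flush count that stops increasing then indicates that the background operation has stalled.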

[~frm], I think this is what we should be doing.

> Add liveliness monitoring for FileStore background operations  
> ---------------------------------------------------------------
>
>                 Key: OAK-7854
>                 URL: https://issues.apache.org/jira/browse/OAK-7854
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: segment-tar
>            Reporter: Michael Dürig
>            Assignee: Michael Dürig
>            Priority: Major
>             Fix For: 1.10
>
>
> The FileStore background operations are ultimately executed through a {{ScheduledExecutorService}}.
> If this scheduling gets blocked (e.g. because of a deadlock or lock contention in
> one of its tasks), there is a chance of repository corruption.
> To minimise potential data loss we should implement monitoring endpoints for the vital
> background operations. This would allow deployments to take action early in case of failures
> and thus simplify recovery.
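
One possible shape for such a liveness check is sketched below, under stated assumptions: the class LivenessProbe and its methods are hypothetical and do not come from Oak. The probe wraps a task before it is handed to the {{ScheduledExecutorService}}, remembers when the task last completed, and lets a monitoring endpoint ask whether that completion is recent enough.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

/** Hypothetical liveness probe (not part of Oak) for a periodically scheduled task. */
final class LivenessProbe {
    private final LongSupplier nanoClock;
    private volatile long lastCompletedNanos;

    /** The clock is injectable for testing; production code could pass System::nanoTime. */
    LivenessProbe(LongSupplier nanoClock) {
        this.nanoClock = nanoClock;
        // Treat creation time as the first "completion" so a fresh probe reports alive.
        this.lastCompletedNanos = nanoClock.getAsLong();
    }

    /** Returns a task that updates the completion timestamp after each successful run. */
    Runnable wrap(Runnable task) {
        return () -> {
            task.run();
            lastCompletedNanos = nanoClock.getAsLong();
        };
    }

    /** True if the wrapped task completed within the given maximum age. */
    boolean isAlive(long maxAge, TimeUnit unit) {
        return nanoClock.getAsLong() - lastCompletedNanos <= unit.toNanos(maxAge);
    }
}
```

The wrapped task would be scheduled as usual, for example with `scheduleWithFixedDelay`, and a health endpoint would call `isAlive` with a threshold of a few scheduling periods. A task that is blocked or keeps failing stops updating the timestamp, so the probe eventually reports it as not alive.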



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
