jackrabbit-oak-issues mailing list archives

From "Chetan Mehrotra (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (OAK-2739) take appropriate action when lease cannot be renewed (in time)
Date Wed, 29 Jul 2015 06:40:06 GMT

    [ https://issues.apache.org/jira/browse/OAK-2739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14645550#comment-14645550 ]

Chetan Mehrotra commented on OAK-2739:
--------------------------------------

[~egli] I am still not clear on the severity of this issue. Currently the background lease is
updated periodically (every 1 sec) by a dedicated thread which just performs a single operation
and not much more. So even if there are issues in other parts, this thread would continue to work
(which might be wrong) and still update the lease every 1 sec.

So to me a lease update does not look like an operation that would take a long time and cause
the above mentioned issues. Maybe I am missing something here.
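
For illustration only, here is a minimal sketch of the kind of dedicated renewal loop described
above. The Lease interface, class name and failure handling are made up for this sketch; they
are not the actual Oak implementation.

    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    public class LeaseRenewalSketch {

        // Hypothetical stand-in for ClusterNodeInfo.renewLease(): the real
        // method pushes a new lease end time to the shared document store.
        interface Lease {
            boolean renew();
        }

        public static ScheduledExecutorService startRenewal(final Lease lease) {
            ScheduledExecutorService executor =
                    Executors.newSingleThreadScheduledExecutor();
            // A dedicated thread performing one small operation per second,
            // independent of whatever the rest of the system is doing.
            executor.scheduleAtFixedRate(new Runnable() {
                @Override
                public void run() {
                    if (!lease.renew()) {
                        // What to do here (when renewal fails or is late)
                        // is exactly the open question of this ticket.
                    }
                }
            }, 0, 1, TimeUnit.SECONDS);
            return executor;
        }
    }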

> take appropriate action when lease cannot be renewed (in time)
> --------------------------------------------------------------
>
>                 Key: OAK-2739
>                 URL: https://issues.apache.org/jira/browse/OAK-2739
>             Project: Jackrabbit Oak
>          Issue Type: Task
>          Components: mongomk
>    Affects Versions: 1.2
>            Reporter: Stefan Egli
>            Assignee: Stefan Egli
>              Labels: resilience
>             Fix For: 1.3.5
>
>
> Currently, in an oak-cluster, when (e.g.) one oak-client stops renewing its lease (ClusterNodeInfo.renewLease()),
this will eventually be noticed by the others in the same oak-cluster. Those then mark this
client as {{inactive}}, start recovery, and subsequently exclude that node from any
further merge etc. operations.
> Now, whatever the reason is that the client stopped renewing the lease (could be an
exception, a deadlock, whatever), that client itself still considers itself {{active}} and
continues to take part in cluster operations.
> This will result in an unbalanced situation where that one client 'sees' everybody as
{{active}} while the others see this one as {{inactive}}.
> If this ClusterNodeInfo state is to be something that can be built upon, and to avoid
any inconsistency due to this unbalanced handling, the inactive node should probably retire gracefully
- or some other appropriate action should be taken, rather than just continuing as it does today.
> This ticket is to keep track of ideas and actions taken with regard to this.
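
As one possible shape of such a "retire gracefully" action, here is a purely illustrative
sketch. The field, the method names and the thrown exception are hypothetical and not existing
Oak API; the point is only that a node checks its own lease before acting as {{active}}.

    public class LeaseGuardSketch {

        // Hypothetical local copy of this node's lease end time; in Oak the
        // lease end is kept per cluster node in ClusterNodeInfo.
        private volatile long leaseEndTime;

        // Called by the renewal thread after a successful renewal.
        void renewed(long newLeaseEnd) {
            leaseEndTime = newLeaseEnd;
        }

        // Called before taking part in cluster operations (merge, background
        // update, ...). If our own lease already looks expired, refuse to
        // continue as an active node instead of carrying on as today.
        void checkLeaseStillValid() {
            if (System.currentTimeMillis() > leaseEndTime) {
                // Hypothetical reaction; deciding on the appropriate action
                // (fail the operation, block, shut down, ...) is the subject
                // of this ticket.
                throw new IllegalStateException(
                        "Lease expired, this cluster node must not continue as active");
            }
        }
    }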



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
