jackrabbit-oak-issues mailing list archives

From "Stefan Egli (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (OAK-2682) Introduce time difference detection for DocumentNodeStore
Date Mon, 10 Aug 2015 12:17:45 GMT

    [ https://issues.apache.org/jira/browse/OAK-2682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14680020#comment-14680020 ]

Stefan Egli commented on OAK-2682:
----------------------------------

bq. So are you asking for the difference between the system clocks of the local machine and the machine on which the database runs?
Yes, exactly. The (simple) cluster time difference detection works as follows:
* each machine checks that its clock is in sync with the server's clock within a certain margin
* if a machine's clock is not: it shall complain very loudly and stop functioning
* given the above, the clocks of any two machines differ by at most 2 * margin_with_server
* which gives us an easy enough cluster-internal time difference detection (without trying too hard to be 'NTP-style')
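
To illustrate the steps above, here is a minimal Java sketch. It is not taken from the attached patch: the class and method names are hypothetical, and it assumes the server's clock can be read with a single request whose network delay is roughly symmetric, so the server reading is compared against the midpoint of two local readings.

```java
// Hypothetical sketch, not the OAK-2682 patch: all names here are assumptions.
class ClockSyncCheck {

    /**
     * Estimates (server clock - local clock) in milliseconds.
     * localBefore and localAfter are local clock readings taken just before
     * and just after the request that returned serverTime.
     */
    static long estimateOffsetMillis(long localBefore, long serverTime, long localAfter) {
        // Assume the server read its clock halfway through the round trip.
        long localMidpoint = localBefore + (localAfter - localBefore) / 2;
        return serverTime - localMidpoint;
    }

    /** Complains loudly (throws) if the offset exceeds the allowed margin. */
    static void checkWithinMargin(long offsetMillis, long marginMillis) {
        if (Math.abs(offsetMillis) > marginMillis) {
            throw new IllegalStateException("Clock differs from server by "
                    + offsetMillis + "ms, allowed margin is " + marginMillis + "ms");
        }
    }

    /** If every machine is within the margin, any two differ by at most twice the margin. */
    static long maxPairwiseOffsetMillis(long marginMillis) {
        return 2 * marginMillis;
    }

    public static void main(String[] args) {
        long offset = estimateOffsetMillis(1000, 1600, 1200);
        System.out.println("estimated offset: " + offset + "ms");
        checkWithinMargin(offset, 2000); // passes: the offset is within the margin
        System.out.println("max pairwise offset: " + maxPairwiseOffsetMillis(2000) + "ms");
    }
}
```

The factor of 2 follows from the triangle inequality: one machine may be ahead of the server by the full margin while another is behind it by the full margin.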

> Introduce time difference detection for DocumentNodeStore
> ---------------------------------------------------------
>
>                 Key: OAK-2682
>                 URL: https://issues.apache.org/jira/browse/OAK-2682
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: core, mongomk
>            Reporter: Stefan Egli
>            Assignee: Stefan Egli
>              Labels: resilience
>             Fix For: 1.3.5
>
>         Attachments: OAK-2682.patch
>
>
> Currently the lease mechanism in DocumentNodeStore/mongoMk is based on the assumption
> that the clocks are in perfect sync between all nodes of the cluster. The lease is valid for
> 60sec with a timeout of 30sec. If clocks are off by too much, and background operations happen
> to take a couple of seconds, you run the risk of timing out a lease. So introducing a check that
> WARNs if the clocks in a cluster are off by too much (1st threshold, e.g. 5sec?) would help
> increase awareness. A further, more drastic measure could be to prevent Oak from starting up at all if
> the difference is, for example, higher than a 2nd threshold (optional I guess, but could be
> 20sec?).
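
The two-threshold idea in the description can be sketched as follows. This is only an illustration under stated assumptions: the class, enum, and method names are hypothetical, and the 5sec and 20sec values are the tentative examples from the description, not settled defaults.

```java
// Hypothetical sketch of the two-threshold policy; names and values are assumptions.
class ClockDifferencePolicy {

    enum Severity { OK, WARN, REFUSE_STARTUP }

    // Tentative example values from the issue description.
    static final long WARN_THRESHOLD_MILLIS = 5_000;
    static final long FAIL_THRESHOLD_MILLIS = 20_000;

    /**
     * Classifies a measured clock difference (either sign) against the two thresholds.
     * A difference exactly equal to a threshold does not exceed it.
     */
    static Severity classify(long differenceMillis, long warnMillis, long failMillis) {
        long magnitude = Math.abs(differenceMillis);
        if (magnitude > failMillis) {
            return Severity.REFUSE_STARTUP;
        }
        if (magnitude > warnMillis) {
            return Severity.WARN;
        }
        return Severity.OK;
    }

    public static void main(String[] args) {
        long[] samples = {3_000, -7_000, 25_000};
        for (long diff : samples) {
            System.out.println(diff + "ms -> "
                    + classify(diff, WARN_THRESHOLD_MILLIS, FAIL_THRESHOLD_MILLIS));
        }
    }
}
```

Keeping the thresholds as parameters rather than hard-coded checks leaves room for the second threshold to stay optional, as the description suggests.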



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)