jackrabbit-oak-issues mailing list archives

From "Stefan Egli (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (OAK-2682) Introduce time difference detection for DocumentNodeStore
Date Tue, 28 Jul 2015 10:15:05 GMT

     [ https://issues.apache.org/jira/browse/OAK-2682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefan Egli updated OAK-2682:
-----------------------------
    Assignee: Stefan Egli

After discussing this with [~mduerig], the suggestion is to follow up on what was discussed
and, as it appears, agreed upon between [~mreutegg] and [~rombert]:
* DocumentStore implementations should expose an MBean function that returns the +time
difference between the local machine and the database server+ (in milliseconds): {{getServerTimeDifferentMillis()}}
* A monitoring tool could then use that MBean function to react when the difference
grows above certain limits (perhaps with a lower 'warn' and a higher 'panic' limit)
* Independent of monitoring, however, the DocumentStore should +apply an initial check at
startup+ on this 'server-time-diff' to assert that the clocks are in sync at least initially.
The assumption is that differences in clock speed are much less of a problem than an initial
time difference. Given this, and the fact that a server startup is usually an admin-controlled
activity, the initial check can apply a rather strict limit (eg 2 seconds). Higher-level
monitoring can be slightly more generous and, for example, have a 2-second warning limit and
a 5-second panic limit.
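
To make the proposal above more concrete, here is a minimal Java sketch of the time-difference function and the startup check. Only the method name {{getServerTimeDifferentMillis()}} and the 2-second startup limit come from the discussion; the interface and class names, the use of {{LongSupplier}} as a stand-in for "ask the database server for its current time", the round-trip midpoint estimate, and the sign convention (positive means the local clock is ahead) are all assumptions made for illustration. JMX registration is omitted.

```java
import java.util.function.LongSupplier;

/** Hypothetical MBean interface; only the method name comes from the proposal. */
interface ServerTimeCheckMBean {
    long getServerTimeDifferentMillis();
}

/** Sketch: estimates the local-vs-server clock difference and checks it at startup. */
class ServerTimeCheck implements ServerTimeCheckMBean {

    /** Startup limit suggested in the proposal ("eg 2 seconds"). */
    static final long STARTUP_LIMIT_MILLIS = 2_000;

    private final LongSupplier localClock;  // e.g. System::currentTimeMillis
    private final LongSupplier serverClock; // assumed: one round trip to the database server

    ServerTimeCheck(LongSupplier localClock, LongSupplier serverClock) {
        this.localClock = localClock;
        this.serverClock = serverClock;
    }

    /** Returns local time minus server time in milliseconds (positive: local clock is ahead). */
    @Override
    public long getServerTimeDifferentMillis() {
        long before = localClock.getAsLong();
        long server = serverClock.getAsLong();
        long after = localClock.getAsLong();
        // Compare the server time with the midpoint of the round trip,
        // so that network latency does not show up as clock difference.
        long localMidpoint = before + (after - before) / 2;
        return localMidpoint - server;
    }

    /** Throws if the clocks differ by more than the startup limit. */
    void assertInSyncAtStartup() {
        long diff = getServerTimeDifferentMillis();
        if (Math.abs(diff) > STARTUP_LIMIT_MILLIS) {
            throw new IllegalStateException("Clock difference to database server is "
                    + diff + " ms, limit is " + STARTUP_LIMIT_MILLIS + " ms");
        }
    }
}
```

A caller would construct this with {{System::currentTimeMillis}} as the local clock and a store-specific supplier for the server time, then invoke {{assertInSyncAtStartup()}} once before the DocumentStore becomes active. A monitoring tool would poll {{getServerTimeDifferentMillis()}} and apply its own, more generous limits.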

I'll follow up on the MongoDocumentStore part of this feature next. (The RDBDocumentStore part
will be handled in a separate ticket.)

> Introduce time difference detection for DocumentNodeStore
> ---------------------------------------------------------
>
>                 Key: OAK-2682
>                 URL: https://issues.apache.org/jira/browse/OAK-2682
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: core, mongomk
>            Reporter: Stefan Egli
>            Assignee: Stefan Egli
>              Labels: resilience
>             Fix For: 1.3.5
>
>
> Currently the lease mechanism in DocumentNodeStore/mongoMk is based on the assumption
> that the clocks are in perfect sync between all nodes of the cluster. The lease is valid for
> 60sec with a timeout of 30sec. If clocks are off by too much, and background operations happen
> to take a couple of seconds, you run the risk of timing out a lease. So introducing a check which
> WARNs if the clocks in a cluster are off by too much (1st threshold, eg 5sec?) would help
> increase awareness. A further, more drastic measure could be to prevent a startup of Oak at all if
> the difference is, for example, higher than a 2nd threshold (optional I guess, but could be
> 20sec?).
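
The two-threshold idea in the quoted description could be sketched as follows. The 5-second and 20-second values are the tentative numbers from the description; the enum and class names are hypothetical, and the choice to treat a difference exactly equal to a threshold as still acceptable is an assumption.

```java
/** Hypothetical outcome of a clock difference check. */
enum SkewLevel { OK, WARN, REFUSE_STARTUP }

/** Sketch of the two-threshold policy from the issue description. */
class ClockSkewPolicy {

    static final long WARN_MILLIS = 5_000;    // 1st threshold ("eg 5sec?")
    static final long REFUSE_MILLIS = 20_000; // 2nd threshold ("could be 20sec?")

    /** Classifies a signed clock difference in milliseconds by its magnitude. */
    static SkewLevel classify(long skewMillis) {
        long magnitude = Math.abs(skewMillis);
        if (magnitude > REFUSE_MILLIS) {
            return SkewLevel.REFUSE_STARTUP;
        }
        if (magnitude > WARN_MILLIS) {
            return SkewLevel.WARN;
        }
        return SkewLevel.OK;
    }
}
```

With these values, a node whose clock is 6 seconds behind would log a warning, while one that is 25 seconds ahead would be prevented from starting.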



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
