jackrabbit-oak-issues mailing list archives

From "Stefan Egli (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (OAK-2682) Introduce time difference detection for DocumentNodeStore
Date Tue, 28 Jul 2015 14:33:06 GMT

     [ https://issues.apache.org/jira/browse/OAK-2682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefan Egli updated OAK-2682:
-----------------------------
    Attachment: OAK-2682.patch

oak-core patch [^OAK-2682.patch] attached, which introduces:
* {{determineServerTimeDifferenceMillis()}}: returns the estimated time difference in milliseconds between the local instance and the (typically common, shared) document server system.
** also exposed via {{DocumentNodeStoreMBean}}
* implementations of the above in {{MongoDocumentStore}} and {{MemoryDocumentStore}} (the latter trivially returns 0)
* {{RDBDocumentStore}} currently throws an {{UnsupportedOperationException}}
* {{DocumentNodeStore}} now performs this check at startup and refuses to start if the difference exceeds a threshold, which defaults to {{2000ms}} (configurable via the {{oak.documentMK.maxServerTimeDiffMillis}} system property)
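
For reviewers, the startup check described in the list above might look roughly like the sketch below. This is not the code from the patch: the class name, the helper methods, the midpoint-based estimate and the sign convention (local minus server) are all assumptions made for illustration. Only the system property name and the 2000 ms default come from this comment.

```java
import java.util.function.LongSupplier;

// Illustrative sketch only; not taken from OAK-2682.patch.
class ServerTimeDiffCheck {

    // Property name and default as stated in the comment above.
    static final String MAX_DIFF_PROP = "oak.documentMK.maxServerTimeDiffMillis";
    static final long DEFAULT_MAX_DIFF_MILLIS = 2000L;

    /**
     * Hypothetical estimator: reads the local clock before and after asking
     * the server for its time, then compares the server time against the
     * midpoint of the two local readings. Returns local minus server.
     */
    static long estimateDiffMillis(LongSupplier localClock, LongSupplier serverClock) {
        long before = localClock.getAsLong();
        long server = serverClock.getAsLong();
        long after = localClock.getAsLong();
        long midpoint = before + (after - before) / 2;
        return midpoint - server;
    }

    /** Reads the configured threshold, falling back to the default. */
    static long configuredMaxDiffMillis() {
        return Long.getLong(MAX_DIFF_PROP, DEFAULT_MAX_DIFF_MILLIS);
    }

    /** Refuses to continue if the absolute difference exceeds the threshold. */
    static void check(long diffMillis, long maxDiffMillis) {
        if (Math.abs(diffMillis) > maxDiffMillis) {
            throw new IllegalStateException("Server clock differs by " + diffMillis
                    + " ms, which exceeds the allowed " + maxDiffMillis + " ms");
        }
    }

    public static void main(String[] args) {
        // Stand-in for a real server: uses the local clock for both sides.
        long diff = estimateDiffMillis(System::currentTimeMillis, System::currentTimeMillis);
        check(diff, configuredMaxDiffMillis());
        System.out.println("estimated diff: " + diff + " ms");
    }
}
```

With such a check in place, the threshold could be raised at launch with {{-Doak.documentMK.maxServerTimeDiffMillis=5000}}.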

/cc [~chetanm], [~mreutegg], [~reschke], please review. I'd commit as soon as I get some positive feedback.

> Introduce time difference detection for DocumentNodeStore
> ---------------------------------------------------------
>
>                 Key: OAK-2682
>                 URL: https://issues.apache.org/jira/browse/OAK-2682
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: core, mongomk
>            Reporter: Stefan Egli
>            Assignee: Stefan Egli
>              Labels: resilience
>             Fix For: 1.3.5
>
>         Attachments: OAK-2682.patch
>
>
> Currently the lease mechanism in DocumentNodeStore/mongoMk is based on the assumption that the clocks are in perfect sync between all nodes of the cluster. The lease is valid for 60sec with a timeout of 30sec. If clocks are off by too much, and background operations happen to take a couple of seconds, you run the risk of timing out a lease. So introducing a check which WARNs if the clocks in a cluster are off by too much (1st threshold, e.g. 5sec?) would help increase awareness. A further, more drastic measure could be to prevent startup of Oak altogether if the difference exceeds a 2nd threshold (optional I guess, but could be 20sec?).
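
The two-threshold idea in the description could be sketched as follows. The class, enum and method names are hypothetical, and the 5 s and 20 s values are only the tentative figures floated in the description, not settled defaults.

```java
// Illustrative sketch of the proposed WARN / refuse-startup policy.
class ClockSkewPolicy {

    enum Action { OK, WARN, REFUSE_STARTUP }

    // Tentative values from the issue description ("5sec?", "20sec?").
    static final long WARN_THRESHOLD_MILLIS = 5_000L;
    static final long REFUSE_THRESHOLD_MILLIS = 20_000L;

    /** Maps a measured clock difference (either sign) to an action. */
    static Action classify(long diffMillis) {
        long abs = Math.abs(diffMillis);
        if (abs > REFUSE_THRESHOLD_MILLIS) {
            return Action.REFUSE_STARTUP;
        }
        if (abs > WARN_THRESHOLD_MILLIS) {
            return Action.WARN;
        }
        return Action.OK;
    }
}
```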



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
