kudu-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From a...@apache.org
Subject kudu git commit: KUDU-2088: Synchronizer may not go out of scope with outstanding references
Date Wed, 02 Aug 2017 01:37:26 GMT
Repository: kudu
Updated Branches:
  refs/heads/master 443a3512e -> c38631097

KUDU-2088: Synchronizer may not go out of scope with outstanding references

A common Kudu pattern is to declare a Synchronizer on the stack, create a
functor referencing it via AsStatusCallback, pass the functor into an
asynchronous function, wait until the functor is invoked (via Wait), then
allow the Synchronizer to go out of scope. This assumes that the async
function only invokes the StatusCallback once; a safe assumption in Kudu.

If using WaitFor instead of Wait, you must "fully wait" on the Synchronizer
before it goes out of scope. That is, WaitFor must be called repeatedly
until it no longer returns TimedOut, to ensure that whomever took the
reference has used it and will not use it again.

In this block of code, that invariant was violated. Namely, a failure in
SnoozeFailureDetector led to an early destruction of log_synchronizer,
causing a crash when the functor is invoked later. SnoozeFailureDetector
could fail if the node was elected leader while waiting on the Synchronizer.

Without the fix, raft_consensus-itest crashed in a particular way about one
or two percent of the time. With it, these crashes no longer manifest.

Change-Id: Icf9ce499616a872ad5ed4a27bd1e90e94fba5ff5
Reviewed-on: http://gerrit.cloudera.org:8080/7562
Reviewed-by: Todd Lipcon <todd@apache.org>
Tested-by: Kudu Jenkins

Project: http://git-wip-us.apache.org/repos/asf/kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/kudu/commit/c3863109
Tree: http://git-wip-us.apache.org/repos/asf/kudu/tree/c3863109
Diff: http://git-wip-us.apache.org/repos/asf/kudu/diff/c3863109

Branch: refs/heads/master
Commit: c38631097a466f50209e211218c6668789f4b445
Parents: 443a351
Author: Adar Dembo <adar@cloudera.com>
Authored: Tue Aug 1 17:15:34 2017 -0700
Committer: Adar Dembo <adar@cloudera.com>
Committed: Wed Aug 2 01:36:03 2017 +0000

 src/kudu/consensus/raft_consensus.cc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/kudu/consensus/raft_consensus.cc b/src/kudu/consensus/raft_consensus.cc
index cd0b9dc..1756931 100644
--- a/src/kudu/consensus/raft_consensus.cc
+++ b/src/kudu/consensus/raft_consensus.cc
@@ -1367,7 +1367,8 @@ Status RaftConsensus::UpdateReplica(const ConsensusRequestPB* request,
       // If just waiting for our log append to finish lets snooze the timer.
       // We don't want to fire leader election because we're waiting on our own log.
       if (s.IsTimedOut()) {
-        RETURN_NOT_OK(SnoozeFailureDetector());
+        WARN_NOT_OK(SnoozeFailureDetector(),
+                    "failed to snooze failure detector");
     } while (s.IsTimedOut());

View raw message