kudu-commits mailing list archives

From ha...@apache.org
Subject [1/2] kudu git commit: KUDU-2320 apply exponential back-off while deleting replica
Date Mon, 12 Mar 2018 21:16:09 GMT
Repository: kudu
Updated Branches:
  refs/heads/master 29dbc7e41 -> 0c05e8375


KUDU-2320 apply exponential back-off while deleting replica

In some scenarios, the replica to remove might be on a tablet
server which hasn't yet registered with the master.  For example,
that happens when the tablet server hosting the replica went down
and stays down while the master is restarted.  In that case
ResetTSProxy() fails, and because the attempt counter was incremented
only when sending the request, retries were scheduled without any
back-off.  Such a scenario is exercised by
RaftConsensusNonVoterITest::RestartClusterWithNonVoter.

I ran the RaftConsensusNonVoterITest::RestartClusterWithNonVoter
scenario before and after the fix.  Before the fix, retries were
logged at a steady high rate (delays of 13 to 50 ms, always with
attempt = 0).  After the fix, the interval between retries started
following the exponential back-off pattern.

An example of the output before the fix:
  I0309 00:07:34.972404  2029 catalog_manager.cc:2697] Scheduling retry of 832f394938da40ca954da7a842e2279b Delete Tablet RPC for TS=76ea4539475745e8983bab0e501d803a with a delay of 13 ms (attempt = 0)
  W0309 00:07:34.972436  2029 catalog_manager.cc:2716] Async tablet task 832f394938da40ca954da7a842e2279b Delete Tablet RPC for TS=76ea4539475745e8983bab0e501d803a failed: Not found: failed to reset TS proxy: Could not find TS for UUID 76ea4539475745e8983bab0e501d803a
  I0309 00:07:34.985633  2029 catalog_manager.cc:2697] Scheduling retry of 832f394938da40ca954da7a842e2279b Delete Tablet RPC for TS=76ea4539475745e8983bab0e501d803a with a delay of 28 ms (attempt = 0)
  W0309 00:07:34.985673  2029 catalog_manager.cc:2716] Async tablet task 832f394938da40ca954da7a842e2279b Delete Tablet RPC for TS=76ea4539475745e8983bab0e501d803a failed: Not found: failed to reset TS proxy: Could not find TS for UUID 76ea4539475745e8983bab0e501d803a
  I0309 00:07:35.014024  2029 catalog_manager.cc:2697] Scheduling retry of 832f394938da40ca954da7a842e2279b Delete Tablet RPC for TS=76ea4539475745e8983bab0e501d803a with a delay of 26 ms (attempt = 0)
  W0309 00:07:35.014062  2029 catalog_manager.cc:2716] Async tablet task 832f394938da40ca954da7a842e2279b Delete Tablet RPC for TS=76ea4539475745e8983bab0e501d803a failed: Not found: failed to reset TS proxy: Could not find TS for UUID 76ea4539475745e8983bab0e501d803a
  I0309 00:07:35.040323  2029 catalog_manager.cc:2697] Scheduling retry of 832f394938da40ca954da7a842e2279b Delete Tablet RPC for TS=76ea4539475745e8983bab0e501d803a with a delay of 19 ms (attempt = 0)
  W0309 00:07:35.040377  2029 catalog_manager.cc:2716] Async tablet task 832f394938da40ca954da7a842e2279b Delete Tablet RPC for TS=76ea4539475745e8983bab0e501d803a failed: Not found: failed to reset TS proxy: Could not find TS for UUID 76ea4539475745e8983bab0e501d803a
  I0309 00:07:35.059588  2029 catalog_manager.cc:2697] Scheduling retry of 832f394938da40ca954da7a842e2279b Delete Tablet RPC for TS=76ea4539475745e8983bab0e501d803a with a delay of 50 ms (attempt = 0)
  W0309 00:07:35.059628  2029 catalog_manager.cc:2716] Async tablet task 832f394938da40ca954da7a842e2279b Delete Tablet RPC for TS=76ea4539475745e8983bab0e501d803a failed: Not found: failed to reset TS proxy: Could not find TS for UUID 76ea4539475745e8983bab0e501d803a

An example of the output after the fix:
  I0308 22:36:59.251387  5428 catalog_manager.cc:2700] Scheduling retry of f259598750084d1db309c1659ee818f9 Delete Tablet RPC for TS=46d5f24c1096492d83b909cd0116edbb with a delay of 37 ms (attempt = 2)
  W0308 22:36:59.251437  5428 catalog_manager.cc:2719] Async tablet task f259598750084d1db309c1659ee818f9 Delete Tablet RPC for TS=46d5f24c1096492d83b909cd0116edbb failed: Not found: failed to reset TS proxy: Could not find TS for UUID 46d5f24c1096492d83b909cd0116edbb
  I0308 22:36:59.288799  5428 catalog_manager.cc:2700] Scheduling retry of f259598750084d1db309c1659ee818f9 Delete Tablet RPC for TS=46d5f24c1096492d83b909cd0116edbb with a delay of 84 ms (attempt = 3)
  W0308 22:36:59.288851  5428 catalog_manager.cc:2719] Async tablet task f259598750084d1db309c1659ee818f9 Delete Tablet RPC for TS=46d5f24c1096492d83b909cd0116edbb failed: Not found: failed to reset TS proxy: Could not find TS for UUID 46d5f24c1096492d83b909cd0116edbb
  I0308 22:36:59.373152  5428 catalog_manager.cc:2700] Scheduling retry of f259598750084d1db309c1659ee818f9 Delete Tablet RPC for TS=46d5f24c1096492d83b909cd0116edbb with a delay of 146 ms (attempt = 4)
  W0308 22:36:59.373209  5428 catalog_manager.cc:2719] Async tablet task f259598750084d1db309c1659ee818f9 Delete Tablet RPC for TS=46d5f24c1096492d83b909cd0116edbb failed: Not found: failed to reset TS proxy: Could not find TS for UUID 46d5f24c1096492d83b909cd0116edbb
  I0308 22:36:59.519738  5428 catalog_manager.cc:2700] Scheduling retry of f259598750084d1db309c1659ee818f9 Delete Tablet RPC for TS=46d5f24c1096492d83b909cd0116edbb with a delay of 267 ms (attempt = 5)
  W0308 22:36:59.519806  5428 catalog_manager.cc:2719] Async tablet task f259598750084d1db309c1659ee818f9 Delete Tablet RPC for TS=46d5f24c1096492d83b909cd0116edbb failed: Not found: failed to reset TS proxy: Could not find TS for UUID 46d5f24c1096492d83b909cd0116edbb
  I0308 22:36:59.787600  5428 catalog_manager.cc:2700] Scheduling retry of f259598750084d1db309c1659ee818f9 Delete Tablet RPC for TS=46d5f24c1096492d83b909cd0116edbb with a delay of 554 ms (attempt = 6)
  W0308 22:36:59.787657  5428 catalog_manager.cc:2719] Async tablet task f259598750084d1db309c1659ee818f9 Delete Tablet RPC for TS=46d5f24c1096492d83b909cd0116edbb failed: Not found: failed to reset TS proxy: Could not find TS for UUID 46d5f24c1096492d83b909cd0116edbb
  I0308 22:37:00.342607  5428 catalog_manager.cc:2700] Scheduling retry of f259598750084d1db309c1659ee818f9 Delete Tablet RPC for TS=46d5f24c1096492d83b909cd0116edbb with a delay of 1056 ms (attempt = 7)
  W0308 22:37:00.342682  5428 catalog_manager.cc:2719] Async tablet task f259598750084d1db309c1659ee818f9 Delete Tablet RPC for TS=46d5f24c1096492d83b909cd0116edbb failed: Not found: failed to reset TS proxy: Could not find TS for UUID 46d5f24c1096492d83b909cd0116edbb
  I0308 22:37:01.399219  5428 catalog_manager.cc:2700] Scheduling retry of f259598750084d1db309c1659ee818f9 Delete Tablet RPC for TS=46d5f24c1096492d83b909cd0116edbb with a delay of 2094 ms (attempt = 8)
  W0308 22:37:01.399274  5428 catalog_manager.cc:2719] Async tablet task f259598750084d1db309c1659ee818f9 Delete Tablet RPC for TS=46d5f24c1096492d83b909cd0116edbb failed: Not found: failed to reset TS proxy: Could not find TS for UUID 46d5f24c1096492d83b909cd0116edbb
  I0308 22:37:03.494213  5428 catalog_manager.cc:2700] Scheduling retry of f259598750084d1db309c1659ee818f9 Delete Tablet RPC for TS=46d5f24c1096492d83b909cd0116edbb with a delay of 4125 ms (attempt = 9)

Change-Id: Ia12d261d7270aae7fafe877780b547d262aef16d
Reviewed-on: http://gerrit.cloudera.org:8080/9561
Reviewed-by: Todd Lipcon <todd@apache.org>
Tested-by: Alexey Serbin <aserbin@cloudera.com>


Project: http://git-wip-us.apache.org/repos/asf/kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/kudu/commit/8aa75d8c
Tree: http://git-wip-us.apache.org/repos/asf/kudu/tree/8aa75d8c
Diff: http://git-wip-us.apache.org/repos/asf/kudu/diff/8aa75d8c

Branch: refs/heads/master
Commit: 8aa75d8cfced3a88f9ae2c5ed9bb4afaec15cec7
Parents: 29dbc7e
Author: Alexey Serbin <aserbin@cloudera.com>
Authored: Fri Feb 23 16:26:32 2018 -0800
Committer: Alexey Serbin <aserbin@cloudera.com>
Committed: Mon Mar 12 18:19:44 2018 +0000

----------------------------------------------------------------------
 src/kudu/master/catalog_manager.cc | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kudu/blob/8aa75d8c/src/kudu/master/catalog_manager.cc
----------------------------------------------------------------------
diff --git a/src/kudu/master/catalog_manager.cc b/src/kudu/master/catalog_manager.cc
index 2aa09af..1a76b12 100644
--- a/src/kudu/master/catalog_manager.cc
+++ b/src/kudu/master/catalog_manager.cc
@@ -2640,9 +2640,12 @@ Status RetryingTSRpcTask::Run() {
   rpc_.Reset();
   rpc_.set_deadline(deadline);
 
+  // Increment the counter of the attempts to run the task.
+  ++attempt_;
+
   Status s = ResetTSProxy();
   if (s.ok()) {
-    if (SendRequest(++attempt_)) {
+    if (SendRequest(attempt_)) {
       return Status::OK();
     }
   } else {
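
The effect of moving the increment out of the success branch can be
illustrated with a small stand-alone sketch.  This is not the Kudu
code: ToyRetryingTask, BaseDelayMs, SimulateDelays, the 8 ms starting
delay and the 60 s cap are all hypothetical names and values chosen
for illustration.  The doubling base delay is an assumption that is
merely consistent with the logged delays above (for example, 4125 ms
at attempt 9 is close to 4096 ms); the small random component visible
in the logs is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not Kudu's API): the base delay doubles per attempt.
// The 8 ms starting point and the 60 s cap are assumptions for illustration.
int64_t BaseDelayMs(int attempt) {
  assert(attempt >= 0);
  constexpr int kMaxDoublings = 12;
  constexpr int64_t kMaxDelayMs = 60 * 1000;
  if (attempt > kMaxDoublings) return kMaxDelayMs;
  return int64_t{1} << (attempt + 3);
}

// Toy stand-in for a retrying task whose proxy reset always fails, as when
// the target tablet server has not registered with the master.
class ToyRetryingTask {
 public:
  explicit ToyRetryingTask(bool count_failed_resets)
      : count_failed_resets_(count_failed_resets) {}

  // Runs one attempt and returns the base delay before the next retry.
  int64_t RunOnce() {
    if (count_failed_resets_) ++attempt_;     // increment on every run
    if (ResetProxy()) {
      if (!count_failed_resets_) ++attempt_;  // increment only when sending
      // A real task would send the RPC here.
    }
    return BaseDelayMs(attempt_);
  }

 private:
  bool ResetProxy() const { return false; }  // proxy lookup never succeeds

  const bool count_failed_resets_;
  int attempt_ = 0;
};

// Collects the base delays of the first `runs` retries.
std::vector<int64_t> SimulateDelays(bool count_failed_resets, int runs) {
  ToyRetryingTask task(count_failed_resets);
  std::vector<int64_t> delays;
  for (int i = 0; i < runs; ++i) delays.push_back(task.RunOnce());
  return delays;
}
```

With the increment inside the success branch, SimulateDelays(false, 4)
yields a constant 8 ms base delay because the counter never leaves 0.
With the increment done up front, SimulateDelays(true, 4) yields
16, 32, 64 and 128 ms.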

