kudu-commits mailing list archives

From a...@apache.org
Subject [kudu] 02/03: Deflake ClientTest.TestServerTooBusyRetry
Date Wed, 01 May 2019 05:20:51 GMT
This is an automated email from the ASF dual-hosted git repository.

adar pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/kudu.git

commit d4f16fc6bc3df2afa49813d9ce73f6b7233648aa
Author: Will Berkeley <wdberkeley@gmail.com>
AuthorDate: Tue Apr 30 15:15:03 2019 -0700

    Deflake ClientTest.TestServerTooBusyRetry

    ClientTest.TestServerTooBusyRetry is a mess of a test. In TSAN mode,
    fewer rows are inserted, so scans require fewer round trips to
    complete, but at the same time threads start more slowly, so the number
    of scans in flight at once tends to be lower. This causes the test to
    occasionally fail to cause a service queue overflow, as it is intended
    to do. Eventually, the test fails because TSAN has an upper bound on the
    number of threads that can be created in the lifetime of a single TSAN
    process, and the test slowly creates scan threads.

    This patch attempts to address the problem by raising the scan batch
    latency in TSAN mode. With this patch, I saw 0 failures in 1000 runs.
    Without it, I got tired of waiting for 850/1000 to finish after 15

    This is a quick fix. In the future someone should consider a more
    serious rewrite of this test.

    Change-Id: Id4d2ee077e9d107fb475c399af5690084bdeef49
    Reviewed-on: http://gerrit.cloudera.org:8080/13200
    Reviewed-by: Adar Dembo <adar@cloudera.com>
    Tested-by: Kudu Jenkins
 src/kudu/client/client-test.cc | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/src/kudu/client/client-test.cc b/src/kudu/client/client-test.cc
index 117a8c3..8547b89 100644
--- a/src/kudu/client/client-test.cc
+++ b/src/kudu/client/client-test.cc
@@ -5251,7 +5251,11 @@ TEST_F(ClientTest, TestServerTooBusyRetry) {
   // Introduce latency in each scan to increase the likelihood of
   // ERROR_SERVER_TOO_BUSY.
+#ifdef THREAD_SANITIZER
+  FLAGS_scanner_inject_latency_on_each_batch_ms = 100;
+#else
   FLAGS_scanner_inject_latency_on_each_batch_ms = 10;
+#endif
   // Reduce the service queue length of each tablet server in order to increase
   // the likelihood of ERROR_SERVER_TOO_BUSY.
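
For readers outside the Kudu tree, the idea of the patch (pick a larger injected scan latency only when the binary is built under ThreadSanitizer) can be sketched standalone roughly as follows. This is an illustration under assumptions, not the code from the commit: SKETCH_TSAN, ScanBatchLatencyMs, and ApplyScanBatchLatency are hypothetical names, and the plain int stands in for the gflags-defined FLAGS_scanner_inject_latency_on_each_batch_ms. Only the values 100 and 10 come from the diff above.

```cpp
#include <cassert>

// Hypothetical TSAN detection for a standalone build. GCC defines
// __SANITIZE_THREAD__ under -fsanitize=thread; Clang exposes
// __has_feature(thread_sanitizer). Kudu's own build system supplies
// its own macro instead, so this block is an assumption of the sketch.
#if defined(__SANITIZE_THREAD__)
#  define SKETCH_TSAN 1
#elif defined(__has_feature)
#  if __has_feature(thread_sanitizer)
#    define SKETCH_TSAN 1
#  endif
#endif

// Returns the per-batch scan latency to inject, in milliseconds.
// 100 ms under TSAN and 10 ms otherwise, matching the values in the diff.
constexpr int ScanBatchLatencyMs() {
#ifdef SKETCH_TSAN
  return 100;
#else
  return 10;
#endif
}

// Writes the chosen latency into a flag-like variable (a stand-in for the
// real gflags variable, which this sketch does not link against).
inline void ApplyScanBatchLatency(int* latency_flag_ms) {
  assert(latency_flag_ms != nullptr);
  *latency_flag_ms = ScanBatchLatencyMs();
}
```

Choosing the value at compile time keeps non-TSAN runs as fast as before while giving slower-starting TSAN threads a longer window in which scans overlap.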
