kudu-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From ale...@apache.org
Subject [2/3] kudu git commit: [tablet] fix nullptr dereference while capturing iterators
Date Fri, 09 Mar 2018 22:38:20 GMT
[tablet] fix nullptr dereference while capturing iterators

Added a check into Tablet::CaptureConsistentIterators() to make sure
the tablet is not stopped/shutdown.

Before this patch in one test scenario I saw stack traces
like below (built in DEBUG configuration):

kudu-tserver: src/kudu/gutil/ref_counted.h:284: T *scoped_refptr<kudu::tablet::TabletComponents>::operator->()
const [T = kudu::tablet::TabletComponents]: Assertion `ptr_ != __null' failed.
*** Aborted at 1517534012 (unix time) try "date -d @1517534012" if you are using GNU date
***
PC: @     0x7ff9ad39cc37 gsignal
*** SIGABRT (@0x3e80000745f) received by PID 29791 (TID 0x7ff99a0bc700) from PID 29791; stack
trace: ***
    @     0x7ff9b5129330 (unknown) at ??:0
    @     0x7ff9ad39cc37 gsignal at ??:0
    @     0x7ff9ad3a0028 abort at ??:0
    @     0x7ff9ad395bf6 (unknown) at ??:0
    @     0x7ff9ad395ca2 __assert_fail at ??:0
    @     0x7ff9b7f2ce52 scoped_refptr<>::operator->() at ??:0
    @     0x7ff9b7f1bf6d kudu::tablet::Tablet::CaptureConsistentIterators() at ??:0
    @     0x7ff9b7f225f6 kudu::tablet::Tablet::Iterator::Init() at ??:0
    @     0x7ff9b94372e3 kudu::tserver::TabletServiceImpl::HandleNewScanRequest() at ??:0
    @     0x7ff9b943a906 kudu::tserver::TabletServiceImpl::Checksum() at ??:0
    @     0x7ff9b3d3c83d kudu::tserver::TabletServerServiceIf::TabletServerServiceIf()::$_11::operator()()
at ??:0
    @     0x7ff9b3d3c682 std::_Function_handler<>::_M_invoke() at ??:0
    @     0x7ff9b2ea026b std::function<>::operator()() at ??:0
    @     0x7ff9b2e9fb2d kudu::rpc::GeneratedServiceIf::Handle() at ??:0
    @     0x7ff9b2ea1ee6 kudu::rpc::ServicePool::RunThread() at ??:0
    @     0x7ff9b2ea4499 boost::_mfi::mf0<>::operator()() at ??:0
    @     0x7ff9b2ea4400 boost::_bi::list1<>::operator()<>() at ??:0
    @     0x7ff9b2ea43aa boost::_bi::bind_t<>::operator()() at ??:0
    @     0x7ff9b2ea418d boost::detail::function::void_function_obj_invoker0<>::invoke()
at ??:0
    @     0x7ff9b2e45f68 boost::function0<>::operator()() at ??:0
    @     0x7ff9b115162d kudu::Thread::SuperviseThread() at ??:0
    @     0x7ff9b5121184 start_thread at ??:0
    @     0x7ff9ad463ffd clone at ??:0
    @                0x0 (unknown)

I used the following WIP stress test for the reproduction scenario:
  https://gerrit.cloudera.org/#/c/9255/

For DEBUG builds, without fix the issues appeared ~0.5% of cases.  After
the fix, the issue could not be reproduced:

Without fix:
  http://dist-test.cloudera.org//job?job_id=aserbin.1518492521.137030

With fix:
  http://dist-test.cloudera.org//job?job_id=aserbin.1518492937.141401

Change-Id: Ia7600f006c8df7f445cc2551e99390177378bcff
Reviewed-on: http://gerrit.cloudera.org:8080/9189
Tested-by: Kudu Jenkins
Reviewed-by: Mike Percy <mpercy@apache.org>
(cherry picked from commit 5d10a56f9d06dc695f2a4469edbabce978912eb4)
Reviewed-on: http://gerrit.cloudera.org:8080/9549
Tested-by: Alexey Serbin <aserbin@cloudera.com>
Reviewed-by: Todd Lipcon <todd@apache.org>


Project: http://git-wip-us.apache.org/repos/asf/kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/kudu/commit/c1fd2bc4
Tree: http://git-wip-us.apache.org/repos/asf/kudu/tree/c1fd2bc4
Diff: http://git-wip-us.apache.org/repos/asf/kudu/diff/c1fd2bc4

Branch: refs/heads/branch-1.6.x
Commit: c1fd2bc428f0778e865739d48cdbc5e828fda6ca
Parents: e605c16
Author: Alexey Serbin <aserbin@cloudera.com>
Authored: Thu Feb 1 19:21:07 2018 -0800
Committer: Alexey Serbin <aserbin@cloudera.com>
Committed: Fri Mar 9 20:06:00 2018 +0000

----------------------------------------------------------------------
 src/kudu/tablet/tablet.cc | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kudu/blob/c1fd2bc4/src/kudu/tablet/tablet.cc
----------------------------------------------------------------------
diff --git a/src/kudu/tablet/tablet.cc b/src/kudu/tablet/tablet.cc
index 29596b2..293a0a4 100644
--- a/src/kudu/tablet/tablet.cc
+++ b/src/kudu/tablet/tablet.cc
@@ -1722,21 +1722,23 @@ Status Tablet::DebugDump(vector<string> *lines) {
 }
 
 Status Tablet::CaptureConsistentIterators(
-  const Schema *projection,
-  const MvccSnapshot &snap,
-  const ScanSpec *spec,
-  OrderMode order,
-  vector<shared_ptr<RowwiseIterator> > *iters) const {
+    const Schema* projection,
+    const MvccSnapshot& snap,
+    const ScanSpec* spec,
+    OrderMode order,
+    vector<shared_ptr<RowwiseIterator>>* iters) const {
+
   shared_lock<rw_spinlock> l(component_lock_);
+  RETURN_IF_STOPPED_OR_CHECK_STATE(kOpen);
 
   // Construct all the iterators locally first, so that if we fail
   // in the middle, we don't modify the output arguments.
-  vector<shared_ptr<RowwiseIterator> > ret;
+  vector<shared_ptr<RowwiseIterator>> ret;
 
   // Grab the memrowset iterator.
   gscoped_ptr<RowwiseIterator> ms_iter;
   RETURN_NOT_OK(components_->memrowset->NewRowIterator(projection, snap, order, &ms_iter));
-  ret.push_back(shared_ptr<RowwiseIterator>(ms_iter.release()));
+  ret.emplace_back(ms_iter.release());
 
   // Cull row-sets in the case of key-range queries.
   if (spec != nullptr && spec->lower_bound_key() && spec->exclusive_upper_bound_key())
{
@@ -1754,7 +1756,7 @@ Status Tablet::CaptureConsistentIterators(
       RETURN_NOT_OK_PREPEND(rs->NewRowIterator(projection, snap, order, &row_it),
                             Substitute("Could not create iterator for rowset $0",
                                        rs->ToString()));
-      ret.push_back(shared_ptr<RowwiseIterator>(row_it.release()));
+      ret.emplace_back(row_it.release());
     }
     ret.swap(*iters);
     return Status::OK();
@@ -1767,7 +1769,7 @@ Status Tablet::CaptureConsistentIterators(
     RETURN_NOT_OK_PREPEND(rs->NewRowIterator(projection, snap, order, &row_it),
                           Substitute("Could not create iterator for rowset $0",
                                      rs->ToString()));
-    ret.push_back(shared_ptr<RowwiseIterator>(row_it.release()));
+    ret.emplace_back(row_it.release());
   }
 
   // Swap results into the parameters.


Mime
View raw message