mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Joseph Wu <jos...@mesosphere.io>
Subject Re: Review Request 51299: Fixed memory leak in master during framework teardown.
Date Mon, 03 Oct 2016 20:54:35 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51299/
-----------------------------------------------------------

(Updated Oct. 3, 2016, 1:54 p.m.)


Review request for mesos, Anand Mazumdar, Greg Mann, and Artem Harutyunyan.


Changes
-------

Reword some comments about the prior behavior (of not discarding) to mention how the future
is discarded in this case.


Repository: mesos


Description
-------

The `process::http::Pipe` class leaks its underlying `shared_ptr`
due to how the master (accidentally) causes the `shared_ptr` to hold
a self-reference.

When the master receives a `SUBSCRIBE` call from an V1 HTTP API
framework, the master creates a `process::http::Pipe` to manage the
reading and writing in separate locations in the code.  For 
synchronization, the read/write ends of the HTTP connection share
some state (via `shared_ptr`).

The master introduces a self-reference via this line in
`Master::addFramework`:
```
  http.closed()
    .onAny(defer(self(), &Self::exited, framework->id(), http));
```

`http.closed()` returns a future managed by the read-end of the `Pipe`.
`http` holds a copy of the write-end of the `Pipe`, which in turn has
a copy of the `shared_ptr`.  Because `http` is passed into the future's 
continuation, a copy of `http` will live inside the read-end's future 
until the either (a) the continuation is executed or (b) the future 
is destructed.

Due to how we manage the future, neither (a) nor (b) are performed:
(a) When the read-end of the `Pipe` is closed, we only complete the
    future if the write-end of the `Pipe` is still open.  During
    framework teardown, the write-end is closed first.
(b) The future in question lives inside a `Promise`, which lives
    inside the `shared_ptr`.  Because the self-reference exists, the
    `shared_ptr` is never destructed; which means the `Promise` and
    future are never destructed either.

---

This patch fixes the leak by making sure the continuation is always
executed (a) or discarded.  It does so by discarding the related 
future when the write-end of the `Pipe` is already closed.


Diffs (updated)
-----

  3rdparty/libprocess/include/process/http.hpp 404196bb198c1ff958b55d72fb29c5fe92dba429 
  3rdparty/libprocess/src/http.cpp 298bd460afdeccd18e201d8a505b961fd4cf3d3e 
  3rdparty/libprocess/src/tests/http_tests.cpp 2538f56c88f1c6074a0d41182a2242a6fdd105f4 

Diff: https://reviews.apache.org/r/51299/diff/


Testing
-------

Found this leak in collaboration with Greg :)

Ran valgrind before applying this patch:
```
LD_RUN_PATH=/path/to/mesos/build/src/.libs LD_LIBRARY_PATH=/path/to/mesos/build/src/.libs
valgrind --leak-check=full src/.libs/mesos-tests --gtest_filter="*SchedulerTest.Teardown*"
```

Observed the following leak (among many false positives):
```
1,881 (216 direct, 1,665 indirect) bytes in 1 blocks are definitely lost in loss record 2,405
of 2,507
   at 0x4C2A105: operator new(unsigned long) (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
   by 0xF259A9: process::http::Pipe::Pipe() (http.hpp:414)
   by 0x5D629A1: mesos::internal::master::Master::Http::scheduler(process::http::Request const&,
Option<std::string> const&) const (http.cpp:796)
   by 0x5DB5806: operator() (master.cpp:858)
   ...
```

Applied the patch and re-ran valgrind.  Confirmed that leak is gone.


Thanks,

Joseph Wu


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message