mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ilya Pronin <ipro...@twopensource.com>
Subject Re: Review Request 54081: Added `--pidfile` option to master and agent binaries.
Date Wed, 01 Mar 2017 20:47:59 GMT


> On Dec. 6, 2016, 10:42 p.m., Benjamin Mahler wrote:
> > Thanks Ilya!
> > 
> > Have you looked at other pidfile related libraries? Looks like BSD provides some
functions for this (they're also available on Linux):
> > https://www.freebsd.org/cgi/man.cgi?query=pidfile&sektion=3&manpath=FreeBSD+6.1-RELEASE
> > https://linux.die.net/man/3/pidfile
> > 
> > It would be great to provide an os-agnostic library within stout for this, e.g.:
> > 
> > ```
> > pidfile::open()
> > pidfile::close()
> > pidfile::remove()
> > ```
> > 
> > I haven't looked too deeply at what we should have these take and return, but I
noticed the use of RAII seemed unnecessary in your patch (only the .cpp implementation uses
the exposed RAII class? If so, perhaps we can just leave RAII out for now since the callers
do not use it). Also it would be great to which process is already running (which is provided
by the BSD/Linux functions). Also, another suggestion as a first step here is to avoid Windows
support entirely for now if we don't implement the locking aspect of this (as that seems necessary
for this to be safely usable?)
> > 
> > We could structure the patches like this:
> > 
> > (1) Addition of pidfile utilities to stout.
> > (2) Tests of the pidfile utilities in stout (should be possible?)
> > (3) Integration of the pidfile utilities into mesos (manually test since the logic
will reside in the mains?)
> 
> Ilya Pronin wrote:
>     Thanks for the feedback!
>     
>     > Looks like BSD provides some functions for this (they're also available on Linux)
>     
>     Yes, I saw `libbsd` but since it's not part of POSIX I decided to make things portable.
>     
>     > Also it would be great to which process is already running
>     
>     Good idea! I will add it.
>     
>     > I noticed the use of RAII seemed unnecessary in your patch (only the .cpp implementation
uses the exposed RAII class? If so, perhaps we can just leave RAII out for now since the callers
do not use it).
>     
>     My intention was to try to be nice and clean up the PID file when daemon exits through
`std::exit()` or terminates on `SIGTERM`. For handling the first case I created a static RAII
object so its destructor will be called during `std::exit()` cleanup. We could place this
object in `main()`. But here comes the second case. Our daemons perform default `SIGTERM`
action which is `term`. There's a signal handler in `logging/logging.cpp` where we can unlink
the file. For that I introduced a helper function and placed the RAII object inside `common/pid_file.cpp`.
>     
>     So I think a RAII class is a good thing for implementing such thing. I'm OK with
moving it to stout and restructuring patches.
>     
>     Also I just thought that I should check if the destructor is executing inside the
process that created the PID file to protect from unlinking by forked processes.
>     
>     What do you think?
> 
> Alexander Rukletsov wrote:
>     A problem with the RAII class here is that we sometimes manually have to release
the lock (in case of a signal). When I look at this patch I read something like "Hey, you
don't have to close & remove the pidfile explicitly, we'll do that for you... well, unless
your process is signaled". I'm inclined to require users call `removePidFile()` explicitly
to avoid creaing a false feeling of safety.
>     
>     Another question: Why do you want to expose `PIDFile` class in the `.hpp`?
> 
> Ilya Pronin wrote:
>     Well, I wrote it more like "you can remove it manually if you want but if you just
forget about it nothing bad will happen" :) The lock will be released automatically when the
fd is closed (e.g. on process termination).
>     
>     I think current `SIGTERM` handling is an exceptional case here (even though I assume
it is the most frequent way the daemon terminates). Another such case could be calling `std::_Exit()`.
These are situations when usual cleanup is not performed and some manual actions may be required.
>     
>     Do we want to always do cleanup manually and explicitly? If this is the accepted
way in Mesos then of course RAII is not appliable here.
>     
>     > Why do you want to expose PIDFile class in the .hpp?
>     
>     For testing. Although there's only 1 simple test.
> 
> Benjamin Mahler wrote:
>     Ah ok, I see all of the cases you were looking to cover now!
>     
>     While we figure out if we can leverage RAII and a static global smart pointer, we
can start with a simple set of function utilities in stout (such that one could build the
RAII wrapper on top of them, if needed).
>     
>     For the static smart pointer, we currently follow Google's style guide rule to ban
"static non-POD variables" in order to avoid destructor ordering issues and access during/after
destruction issues:
>     https://google.github.io/styleguide/cppguide.html#Static_and_Global_Variables
>     
>     We could try to justify an exception here, but it seems that we could just acheive
the same behavior using `std::atexit` and `std::at_quick_exit`? Either way, unlinking seems
like an "optimization" in that the tooling will have to handle the case where SIGKILL is used
and the pidfile remains but has been unlocked. So maybe we could add the `std::atexit` and
`std::at_quick_exit` as a distinct patch on top of the "non-optimized" version.
> 
> Ilya Pronin wrote:
>     I agree. Using `std::at_quick_exit()` will even cover an additional case of `std::quick_exit()`.
>     
>     So with free utility functions we will have a design like this (everything is in
a namespace):
>     ```c++
>     // Contains path, FD, PID.
>     struct Handle;
>     
>     // Open, lock and write a file at the specified path.
>     Try<Handle> open(const string&);
>     
>     // Close the file.
>     Try<Nothing> close(const Handle&);
>     
>     // Close and remove the file.
>     Try<Nothing> remove(const Handle&);
>     ```
>     
>     Seems like we could still use a class with similar member functions here instead
of a C-style handle-struct?
> 
> Greg Mann wrote:
>     A class with member functions would seem to me to imply that the lifecycle of the
pid file is linked to the lifecycle of the class, and it sounds like that's something we want
to avoid at the moment?

OK. Sorry for the delay, got distracted by other stuff. Will send updated reviews shortly.


- Ilya


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54081/#review158240
-----------------------------------------------------------


On Nov. 28, 2016, 12:16 p.m., Ilya Pronin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54081/
> -----------------------------------------------------------
> 
> (Updated Nov. 28, 2016, 12:16 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Vinod Kone.
> 
> 
> Bugs: MESOS-1648
>     https://issues.apache.org/jira/browse/MESOS-1648
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> The PID file is created and kept locked while master / agent is running to prevent other
instances with the same PID file location from starting.
> 
> 
> Diffs
> -----
> 
>   src/CMakeLists.txt d6e213686a44fbca0ed841594ce803151224a5a7 
>   src/Makefile.am dd1626d177b38a6613f18f32bb0668abbb5100e0 
>   src/common/pid_file.hpp PRE-CREATION 
>   src/common/pid_file.cpp PRE-CREATION 
>   src/logging/logging.cpp 70d66a5c396f709e8f27ad0d51315ed6d257f73b 
>   src/master/main.cpp fa7ba1310142a3bef71379ba37fded9b8390aae9 
>   src/slave/main.cpp 8010f8e229e2d820649750c9db0456ecd1b854d3 
>   src/tests/common/pid_file_tests.cpp PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/54081/diff/2/
> 
> 
> Testing
> -------
> 
> Added test to verify that PID file is created upon `PIDFile` object creation and deleted
upon its destruction. Ran `make check`.
> 
> 
> Thanks,
> 
> Ilya Pronin
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message