mesos-reviews mailing list archives

From Alexander Rukletsov <ruklet...@gmail.com>
Subject Re: Review Request 61575: Added a test for verifying signal escalation on the default executor.
Date Tue, 22 Aug 2017 10:21:45 GMT


> On Aug. 22, 2017, 9:02 a.m., Alexander Rukletsov wrote:
> > src/tests/kill_policy_test_helper.cpp
> > Lines 61-64 (original), 62-68 (patched)
> > <https://reviews.apache.org/r/61575/diff/4/?file=1798824#file1798824line62>
> >
> >     Good idea checkpointing the ready state! Can we also checkpoint when we get
> >     `SIGTERM` and `SIGKILL`? Or is this tricky due to async-signal safety?

I mean, we cannot catch `SIGKILL`, but we can touch a file right before we exit: the absence
of this file would hint that the process might have been killed.

Regarding signal safety, strictly speaking our `SIGTERM` handler here is not safe, but we
can work around it, I think. Here is one way to rewrite it:
```
const char KillPolicyTestHelper::NAME[] = "KillPolicy";

const char KillPolicyTestHelper::RUNNING_MARKER_FILENAME[] =
  "kill-policy-helper-running";

const char KillPolicyTestHelper::GOT_SIGTERM_MARKER_FILENAME[] =
  "kill-policy-helper-got-sigterm";

const char KillPolicyTestHelper::TERMINATING_MARKER_FILENAME[] =
  "kill-policy-helper-terminating";

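// Written from the signal handler, hence `volatile sig_atomic_t`.
static volatile sig_atomic_t sigtermCaught;


void sigtermHandler(int signum)
{
  // Only record that SIGTERM arrived; setting a flag is async-signal-safe.
  sigtermCaught = true;
}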


KillPolicyTestHelper::Flags::Flags()
{
  add(&Flags::sleep_duration,
      "sleep_duration",
      "Number of seconds for which the helper will stay alive after having "
      "received a SIGTERM signal.");
}


// This test helper blocks until it receives a SIGTERM, then sleeps
// for a configurable amount of time before finally returning EXIT_SUCCESS.
int KillPolicyTestHelper::execute()
{
  sigtermCaught = false;

  // Set up the signal handler.
  struct sigaction action;
  memset(&action, 0, sizeof(struct sigaction));
  action.sa_handler = sigtermHandler;
  sigaction(SIGTERM, &action, nullptr);

  os::touch(RUNNING_MARKER_FILENAME);

  // Block the process until we get SIGTERM.
  do {
    pause();
  } while (!sigtermCaught);

  std::cerr << "Received SIGTERM" << std::endl;
  os::touch(GOT_SIGTERM_MARKER_FILENAME);

  os::sleep(Seconds(flags.sleep_duration));
  os::touch(TERMINATING_MARKER_FILENAME);

  return EXIT_SUCCESS;
}
```
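
One caveat with the `pause()` loop in the snippet: if `SIGTERM` is delivered after the flag is checked but before `pause()` is entered, the helper can sleep until another signal arrives. A possible variant, sketched here as a standalone function rather than as part of the patch, blocks `SIGTERM` first and waits with `sigsuspend()`, which swaps the signal mask and waits atomically. The names `onSigterm` and `waitForSigterm` are made up for illustration:

```cpp
#include <signal.h>

// Flag set by the handler; `volatile sig_atomic_t` is a type that can
// safely be written from a signal handler.
static volatile sig_atomic_t sigtermCaught = 0;


extern "C" void onSigterm(int)
{
  // Only set a flag; everything else happens outside the handler.
  sigtermCaught = 1;
}


// Installs the handler and blocks until SIGTERM has been delivered.
// Returns false if setting up the mask or the handler fails.
bool waitForSigterm()
{
  sigtermCaught = 0;

  // Block SIGTERM so it cannot be delivered between the flag check
  // and the wait below.
  sigset_t blocked;
  sigset_t previous;
  sigemptyset(&blocked);
  sigaddset(&blocked, SIGTERM);
  if (sigprocmask(SIG_BLOCK, &blocked, &previous) != 0) {
    return false;
  }

  struct sigaction action = {};
  action.sa_handler = onSigterm;
  sigemptyset(&action.sa_mask);
  action.sa_flags = 0;
  if (sigaction(SIGTERM, &action, nullptr) != 0) {
    sigprocmask(SIG_SETMASK, &previous, nullptr);
    return false;
  }

  // Wait with SIGTERM unblocked. `sigsuspend()` installs this mask and
  // suspends in one atomic step, and restores the old mask on return.
  sigset_t waitMask = previous;
  sigdelset(&waitMask, SIGTERM);
  while (!sigtermCaught) {
    sigsuspend(&waitMask);
  }

  // Restore the caller's original signal mask.
  sigprocmask(SIG_SETMASK, &previous, nullptr);
  return true;
}
```

In the helper, a call like `waitForSigterm()` would take the place of the handler setup and the `do { pause(); } while (...)` loop; the marker files would still be touched from regular code, not from the handler.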


- Alexander


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61575/#review183452
-----------------------------------------------------------


On Aug. 21, 2017, 9:26 p.m., Anand Mazumdar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61575/
> -----------------------------------------------------------
> 
> (Updated Aug. 21, 2017, 9:26 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Gastón Kleiman, Jie Yu, and Vinod Kone.
> 
> 
> Bugs: MESOS-7879
>     https://issues.apache.org/jira/browse/MESOS-7879
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This test uses the kill policy helper and blocks the SIGTERM signal.
> 
> Review: https://reviews.apache.org/r/61575
> 
> 
> Diffs
> -----
> 
>   src/tests/default_executor_tests.cpp afe0afabf784fb65eb833beadd3c584722c321e1 
>   src/tests/kill_policy_test_helper.hpp 29651102ec46b477e6e797c6e6bdef5b10afa665 
>   src/tests/kill_policy_test_helper.cpp a1880595ff015475f1ba49437d49f7397da19422 
> 
> 
> Diff: https://reviews.apache.org/r/61575/diff/4/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Anand Mazumdar
> 
>

