mesos-reviews mailing list archives

From Alexander Rukletsov <ruklet...@gmail.com>
Subject Re: Review Request 55901: Added support for command health checks to the default executor.
Date Wed, 08 Feb 2017 18:31:53 GMT


> On Jan. 28, 2017, 1:17 a.m., Vinod Kone wrote:
> > src/checks/health_checker.cpp, lines 426-429
> > <https://reviews.apache.org/r/55901/diff/1/?file=1613998#file1613998line426>
> >
> >     why a `.repair()` here?
> 
> Gastón Kleiman wrote:
>     To fail the future/check with a nice descriptive message that will later be logged.
> 
> Vinod Kone wrote:
>     The message returned by `http::connect` should be good enough? Do we use this pattern elsewhere, esp. with connect? Most uses of `repair` I have seen in the code base transform the future rather than add extra logging information.
> 
> Alexander Rukletsov wrote:
>     We use `.repair` for adjusting the message or changing the error type, which is technically the same. Here are some examples from the code:
>     https://github.com/apache/mesos/blob/25f4feae487d53a701adb787fd8a2e5f6166b789/3rdparty/libprocess/src/http.cpp#L1766-L1769
>     https://github.com/apache/mesos/blob/25f4feae487d53a701adb787fd8a2e5f6166b789/src/master/http.cpp#L4695-L4697
>     https://github.com/apache/mesos/blob/25f4feae487d53a701adb787fd8a2e5f6166b789/src/slave/containerizer/mesos/io/switchboard.cpp#L1632-L1638
>     
>     Now, the question is whether the connection failure message is good enough. Gastón, could you trigger a failure path and check what error message will be returned?
> 
> Gastón Kleiman wrote:
>     This is what it looks like with the repair:
>     
>     ```
>     W0207 10:42:19.659122  9361 health_checker.cpp:314] Health check failed 1 times consecutively: COMMAND health check failed: Unable to establish connection with the agent: Failed to connect to 192.99.40.208:31338: Connection refused
>     ```
>     
>     Without the repair it would look like this:
>     
>     ```
>     W0207 10:42:19.659122  9361 health_checker.cpp:314] Health check failed 1 times consecutively: COMMAND health check failed: Failed to connect to 192.99.40.208:31338: Connection refused
>     ```

I'd vote for keeping `.repair`.
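
For reference, the pattern under discussion can be sketched roughly as below. This is a self-contained illustration and not libprocess code: `Outcome`, `repair`, `connectToAgent` and `describeCheck` are hypothetical stand-ins that only mimic the idea of replacing a failure with a more descriptive one. The actual change in the review uses the real future type and its `.repair()`.

```cpp
#include <functional>
#include <string>

// Hypothetical stand-in for a future that is either ready or failed.
// NOT the libprocess `Future`; it only models the failure message.
struct Outcome {
  bool failed;
  std::string message;  // Failure text when `failed`, payload otherwise.
};

// Mimics the role of `.repair()`: a failed outcome is handed to the
// callback, which returns a replacement; a ready outcome passes through.
Outcome repair(
    const Outcome& outcome,
    const std::function<Outcome(const std::string&)>& f) {
  return outcome.failed ? f(outcome.message) : outcome;
}

// Hypothetical connection attempt that always fails, reusing the
// failure text from the log excerpt quoted above.
Outcome connectToAgent() {
  return {true,
          "Failed to connect to 192.99.40.208:31338: Connection refused"};
}

// Builds the text that follows "Health check failed 1 times
// consecutively:" in the log, with the extra context added on failure.
std::string describeCheck() {
  Outcome result = repair(
      connectToAgent(),
      [](const std::string& failure) {
        // Still a failure, but with a prefix saying what was attempted.
        return Outcome{
            true, "Unable to establish connection with the agent: " + failure};
      });

  return "COMMAND health check failed: " + result.message;
}
```

Calling `describeCheck()` yields the longer of the two messages quoted above. Dropping the `repair` call leaves only the bare connection error, which does not say which component the checker was trying to reach.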


- Alexander


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55901/#review163363
-----------------------------------------------------------


On Feb. 8, 2017, 1:27 p.m., Gastón Kleiman wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55901/
> -----------------------------------------------------------
> 
> (Updated Feb. 8, 2017, 1:27 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Anand Mazumdar, haosdent huang, and Vinod Kone.
> 
> 
> Bugs: MESOS-6280
>     https://issues.apache.org/jira/browse/MESOS-6280
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Added support for command health checks to the default executor.
> 
> 
> Diffs
> -----
> 
>   src/checks/health_checker.hpp 95da1ff7dd6b222a93076633eb3757ec9aa43cf6 
>   src/checks/health_checker.cpp 58380dc18896f659aa9c4fb4bb567a55bba97f6b 
>   src/launcher/default_executor.cpp e63cf153831088851863d0956455a024e9bc172a 
>   src/tests/health_check_tests.cpp 7b6a803a28b2e4f6c27e9a0c4f668350ec2d5a81 
> 
> Diff: https://reviews.apache.org/r/55901/diff/
> 
> 
> Testing
> -------
> 
> Introduced a new test: `HealthCheckTest.DefaultExecutorCmdHealthCheck`. It passes on Linux, but not on macOS.
> 
> 
> Thanks,
> 
> Gastón Kleiman
> 
>

