mesos-reviews mailing list archives

From Qian Zhang <zhq527...@gmail.com>
Subject Re: Review Request 65518: Reaped the container process directly in Docker executor.
Date Mon, 12 Feb 2018 03:45:30 GMT


> On Feb. 10, 2018, 8:55 a.m., Vinod Kone wrote:
> > src/docker/executor.cpp
> > Lines 277 (patched)
> > <https://reviews.apache.org/r/65518/diff/2/?file=1953359#file1953359line277>
> >
> >     s/never returns though/to never return although/

I see Greg suggests changing `returns` to `returning` in his comment below, which I think
is better.


> On Feb. 10, 2018, 8:55 a.m., Vinod Kone wrote:
> > src/docker/executor.cpp
> > Lines 280 (patched)
> > <https://reviews.apache.org/r/65518/diff/2/?file=1953359#file1953359line280>
> >
> >     when can this be None()?

According to this comment https://github.com/apache/mesos/blob/1.5.0/src/docker/docker.hpp#L104:L106,
it will be `None()` when the container is not running.

But I am not sure whether it will be `None()` when the container is not running AND the Docker
issue (https://github.com/moby/moby/issues/33820) occurs. I mean, if we launch a Docker container
which exits immediately (e.g., it executes the command `exit 0`), and due to that Docker issue
the Docker daemon does not catch the container's exit, will `container.pid` be `None()` or not?
If it is not `None()` in this case, then we will reap the process in this lambda, which is
good, but if it is `None()`, then we will fail to reap the process, which is not correct. My
suspicion is that it will not be `None()` in this case, but just to be safe, let's also do the below
in the case that `container.pid` is `None()`. What do you think?
```
delay(
    Seconds(3),
    self(),
    &Self::reapedContainer,
    container.pid.get());
```
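
One caveat about the snippet above: in the branch where `container.pid` is `None()` there is no pid value to pass along, so `container.pid.get()` cannot be used there as written. Below is a minimal sketch of guarding the access first. It uses `std::optional` as a stand-in for Mesos's `Option<T>`; the names `ReapAction`, `chooseReapAction`, and `pidToReap` are hypothetical and not part of the patch, and "inspect again later" is only one assumed way to handle the `None()` case.

```cpp
#include <cassert>
#include <optional>
#include <sys/types.h>

// Hypothetical names for illustration only; std::optional stands in for
// Mesos's Option<T>.
enum class ReapAction { ReapDirectly, InspectAgainLater };

// Decide what to do based on whether a pid is available.
ReapAction chooseReapAction(const std::optional<pid_t>& pid)
{
  if (pid.has_value()) {
    // A pid is known, so the process can be reaped directly.
    return ReapAction::ReapDirectly;
  }

  // No pid yet (assumed handling): inspect the container again after a
  // delay instead of dereferencing the empty value.
  return ReapAction::InspectAgainLater;
}

// Extract the pid only once the caller has established that it exists.
pid_t pidToReap(const std::optional<pid_t>& pid)
{
  assert(pid.has_value());
  return *pid;
}
```

With this shape, the delayed call only ever receives a pid that actually exists, and the `None()` case stays an explicit, separate branch.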


> On Feb. 10, 2018, 8:55 a.m., Vinod Kone wrote:
> > src/docker/executor.cpp
> > Lines 288 (patched)
> > <https://reviews.apache.org/r/65518/diff/2/?file=1953359#file1953359line288>
> >
> >     Add a LOG line here?

If we add a LOG here, should we add one to `reaped` too? These are the two methods
that catch the exit status of the container.


> On Feb. 10, 2018, 8:55 a.m., Vinod Kone wrote:
> > src/docker/executor.cpp
> > Lines 530 (patched)
> > <https://reviews.apache.org/r/65518/diff/2/?file=1953359#file1953359line530>
> >
> >     why not call `reaped` directly? sounds like `reaped` does a bunch of necessary
> >     cleanup that you are skipping by calling `_reaped`?

Agreed, thanks for catching it!


> On Feb. 10, 2018, 8:55 a.m., Vinod Kone wrote:
> > src/docker/executor.cpp
> > Line 490 (original), 535 (patched)
> > <https://reviews.apache.org/r/65518/diff/2/?file=1953359#file1953359line535>
> >
> >     shouldn't we return here if `terminated` already is set in case the `run` returns
> >     after `reapedContainer` is called?

Yes, we should return here if we call `reaped` instead of `_reaped` in `reapedContainer`, as
you suggested in the comment above.
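
The early return discussed here can be sketched as follows. This is a simplified, hypothetical model (the class `ExitHandler` and its members are illustrative, not the actual executor code), assuming both `reapedContainer` and the `docker run` continuation end up calling the same `reaped` path:

```cpp
#include <cassert>

// Hypothetical, simplified model of the exit handling; not the real
// Docker executor class.
class ExitHandler
{
public:
  // Called by whichever path observes the container exit first.
  // Returns true only if this call actually performed the handling.
  bool reaped(int exitStatus)
  {
    if (terminated) {
      // The other path already handled the exit; do nothing.
      return false;
    }

    terminated = true;
    status_ = exitStatus;
    return true;
  }

  // Exit status recorded by the first call to `reaped`.
  int status() const
  {
    assert(terminated);
    return status_;
  }

private:
  bool terminated = false;
  int status_ = 0;
};
```

The guard makes the handler idempotent: a later call from the second path neither repeats the cleanup nor overwrites the status recorded by the first.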


- Qian


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65518/#review197206
-----------------------------------------------------------


On Feb. 9, 2018, 9:03 a.m., Qian Zhang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65518/
> -----------------------------------------------------------
> 
> (Updated Feb. 9, 2018, 9:03 a.m.)
> 
> 
> Review request for mesos, Gaston Kleiman, Gilbert Song, Greg Mann, and Vinod Kone.
> 
> 
> Bugs: MESOS-8488
>     https://issues.apache.org/jira/browse/MESOS-8488
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Due to a Docker issue (https://github.com/moby/moby/issues/33820), the
> Docker daemon will fail to catch a container exit, i.e., the container
> process has already exited but the command `docker ps` still shows the
> container as running. This causes the `docker run` command that we
> execute in the Docker executor to never return, and it also causes the
> `docker stop` command to have no effect, i.e., it returns without
> error but `docker ps` still shows the container as running, so the
> task gets stuck in the `TASK_KILLING` state.
> 
> To work around this Docker issue, this patch makes the Docker executor
> reap the container process directly, so that the Docker executor is
> notified once the container process exits.
> 
> 
> Diffs
> -----
> 
>   src/docker/executor.cpp e4c53d558e414e50b1c429fba8e31e504c63744a 
> 
> 
> Diff: https://reviews.apache.org/r/65518/diff/2/
> 
> 
> Testing
> -------
> 
> sudo make check
> 
> 
> Thanks,
> 
> Qian Zhang
> 
>

