mesos-reviews mailing list archives

From Qian Zhang <zhq527...@gmail.com>
Subject Re: Review Request 72516: Erased `Info` struct before unmounting volumes in Docker volume isolator.
Date Tue, 26 May 2020 01:43:19 GMT


> On May 26, 2020, 1:57 a.m., Andrei Budnik wrote:
> > src/slave/containerizer/mesos/isolators/docker/volume/isolator.cpp
> > Lines 666 (patched)
> > <https://reviews.apache.org/r/72516/diff/1/?file=2231872#file2231872line666>
> >
> >     Since the isolator doesn’t remove the checkpoint directory on unmount failure
> >     (`_cleanup`), wouldn’t the info entry get restored on the agent restart (`_recover`)?

Yes, the info entry will be restored on agent recovery. Since the container process has
already been killed, `MesosContainerizerProcess::reaped` will be called for the container
during recovery, and it will call `MesosContainerizerProcess::destroy` to destroy the container,
so we get another chance to unmount the volume in the `docker/volume` isolator.


- Qian


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72516/#review220856
-----------------------------------------------------------


On May 26, 2020, 9:41 a.m., Qian Zhang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72516/
> -----------------------------------------------------------
> 
> (Updated May 26, 2020, 9:41 a.m.)
> 
> 
> Review request for mesos, Andrei Budnik and Greg Mann.
> 
> 
> Bugs: MESOS-10126
>     https://issues.apache.org/jira/browse/MESOS-10126
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Currently, when `DockerVolumeIsolatorProcess::cleanup()` is called, we
> unmount the volume first, and if the unmount operation fails we do NOT
> erase the container's `Info` struct from `infos`. This is problematic
> because the remaining `Info` in `infos` keeps the reference count of
> the volume greater than 0 even though the volume is not being used by
> any container. That means we may never get a chance to unmount this volume
> on this agent. Furthermore, if it is an EBS volume, it cannot be used by any
> tasks launched on other agents, since an EBS volume can only be attached
> to one node at a time. The only workaround would be to manually unmount the volume.
> 
> So in this patch `DockerVolumeIsolatorProcess::cleanup()` is updated to erase
> the container's `Info` struct before unmounting volumes.
> 
> 
> Diffs
> -----
> 
>   src/slave/containerizer/mesos/isolators/docker/volume/isolator.cpp c547696f50a4df9cce4ee9078b5fe90b93fd91d2
> 
> 
> Diff: https://reviews.apache.org/r/72516/diff/2/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Qian Zhang
> 
>
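
The ordering issue in the description above can be illustrated with a minimal
sketch. This is not the Mesos code: `Info`, `Infos`, `references`,
`cleanupUnmountFirst` and `cleanupEraseFirst` are hypothetical stand-ins, and
the unmount result is passed in as a flag instead of calling a real volume
driver. It only shows how a failed unmount leaves a stale reference when the
`Info` entry is erased after the unmount rather than before it.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical stand-in for the isolator's per-container `Info` struct:
// here it only records the names of the volumes the container uses.
struct Info {
  std::set<std::string> volumes;
};

// Hypothetical stand-in for the isolator's `infos` map (container ID -> Info).
using Infos = std::map<std::string, Info>;

// Counts how many containers in `infos` still reference `volume`.
int references(const Infos& infos, const std::string& volume) {
  int count = 0;
  for (const auto& entry : infos) {
    count += static_cast<int>(entry.second.volumes.count(volume));
  }
  return count;
}

// Old ordering: unmount first, erase the entry only if the unmount succeeded.
// On failure the entry stays behind and keeps the reference count above 0.
bool cleanupUnmountFirst(Infos& infos, const std::string& id, bool unmountOk) {
  if (!unmountOk) {
    return false;  // Early return: `infos[id]` is never erased.
  }
  infos.erase(id);
  return true;
}

// Patched ordering: erase the entry first, then report the unmount result.
// A failed unmount no longer leaves a stale reference in `infos`.
bool cleanupEraseFirst(Infos& infos, const std::string& id, bool unmountOk) {
  infos.erase(id);
  return unmountOk;
}
```

With a single container `c1` using volume `vol` and a failing unmount, the
first function leaves `references(infos, "vol")` at 1, while the second
brings it down to 0, so a later cleanup or a newly launched container is not
blocked by a count that no longer matches reality.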

