mesos-reviews mailing list archives

From Guangya Liu <gyliu...@gmail.com>
Subject Re: Review Request 50123: Added GPU scheduler for docker containerizer.
Date Mon, 01 Aug 2016 13:53:41 GMT


> On July 27, 2016, 7:55 a.m., Guangya Liu wrote:
> > src/slave/containerizer/docker.cpp, lines 1550-1561
> > <https://reviews.apache.org/r/50123/diff/1/?file=1445669#file1445669line1550>
> >
> >     Why release the GPU resources here? The `update` will also be called when the
> >     executor is launched.
> 
> Yubo Li wrote:
>     This logic is called when the task terminates. We allocate GPUs in
>     `Future<Nothing> DockerContainerizerProcess::update()` when a task launches, and release
>     them in `Future<Nothing> DockerContainerizerProcess::_update()` when a task terminates.
> 
> Guangya Liu wrote:
>     I think that when a task terminates, `DockerContainerizerProcess::destroy` will
>     be called.
> 
> Yubo Li wrote:
>     Yes, but `DockerContainerizerProcess::destroyed` is called after the `gpus` resource
>     has been released. Once `gpus` is released, a new task can get the resource and try to
>     run on this node, but at that point the GPU allocator may not have released the GPUs
>     yet. In that case, the new task will hang.
>     
>     I will release the GPUs in `Future<Nothing> DockerContainerizerProcess::_update()`
>     when `container.pid.isNone()`, which indicates that the update refers to container
>     destruction.

Thanks Yu Bo. Yes, we should release the GPU resources in both of those places: 1) when the
task exits normally, and 2) when the task is killed.
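
A minimal sketch of releasing from both paths, assuming an idempotent release
helper so that whichever path runs second is a no-op. This is not the code in
the patch: `GpuPool`, `Container`, `releaseGpus`, `onUpdate` and `onDestroy`
are hypothetical names used only for illustration, and `hasPid` stands in for
the `container.pid.isNone()` check mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical stand-in for a GPU allocator; NOT the NvidiaGpuAllocator API.
class GpuPool {
public:
  explicit GpuPool(std::size_t total) {
    for (std::size_t i = 0; i < total; ++i) free_.insert(i);
  }

  // Hands out `count` GPU ids, or an empty set if too few are free.
  std::set<std::size_t> allocate(std::size_t count) {
    std::set<std::size_t> out;
    if (free_.size() < count) return out;
    while (out.size() < count) {
      std::set<std::size_t>::iterator it = free_.begin();
      out.insert(*it);
      free_.erase(it);
    }
    return out;
  }

  // Returns the given GPU ids to the free set.
  void deallocate(const std::set<std::size_t>& gpus) {
    free_.insert(gpus.begin(), gpus.end());
  }

  std::size_t available() const { return free_.size(); }

private:
  std::set<std::size_t> free_;
};

// Hypothetical per-container state.
struct Container {
  bool hasPid = false;          // false once the container process is gone
  std::set<std::size_t> gpus;   // GPUs currently held by this container
};

// Idempotent release: safe to call from either path, in any order.
void releaseGpus(GpuPool& pool, Container& c) {
  if (c.gpus.empty()) return;
  pool.deallocate(c.gpus);
  c.gpus.clear();
}

// Path 1: an update that arrives after the task exited (no pid).
void onUpdate(GpuPool& pool, Container& c) {
  if (!c.hasPid) releaseGpus(pool, c);
}

// Path 2: the task is killed and the container is destroyed.
void onDestroy(GpuPool& pool, Container& c) {
  c.hasPid = false;
  releaseGpus(pool, c);
}
```

Because the helper clears the container's GPU set after returning it to the
pool, calling `onUpdate` and then `onDestroy` (or the reverse) never returns
the same GPUs twice.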


- Guangya


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/50123/#review143692
-----------------------------------------------------------


On July 29, 2016, 5:42 a.m., Yubo Li wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/50123/
> -----------------------------------------------------------
> 
> (Updated July 29, 2016, 5:42 a.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Kevin Klues, and Rajat Phull.
> 
> 
> Bugs: MESOS-5795
>     https://issues.apache.org/jira/browse/MESOS-5795
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This adds 'NvidiaGpuAllocator' to the docker containerizer so that the
> docker containerizer can use it to allocate GPUs to a task with the 'gpus'
> resource. Also, allocated GPUs are automatically deallocated after the
> job is destroyed.
> 
> 
> Diffs
> -----
> 
>   src/slave/containerizer/docker.hpp 43ca4317d608b3b43dd7bd0d1b55c721e7364885 
>   src/slave/containerizer/docker.cpp 12bad2db03bcf755317c654f028b628c5c407a62 
>   src/tests/mesos.hpp 51c66f175c80ebacd5af230222ea7e4c81dfc1e8 
>   src/tests/mesos.cpp 30492d7e3b4c5e9ae9d2b2446cadba62d43a3c65 
> 
> Diff: https://reviews.apache.org/r/50123/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Yubo Li
> 
>

