mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From haosdent huang <haosd...@gmail.com>
Subject Re: Review Request 46158: Completed implementation of the cgroups unified isolator.
Date Sat, 25 Jun 2016 18:57:33 GMT


> On June 12, 2016, 2:51 p.m., Qian Zhang wrote:
> > src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp, lines 351-363
> > <https://reviews.apache.org/r/46158/diff/2/?file=1347650#file1347650line351>
> >
> >     Why do we prepare hierarchy here? I think preparing hierarchy for a specific
container is a low-level work which should be handled in `Subsystem::prepare()` method rather
than here. And in that way, we only need one `for` loop (the following one) rather than 2
here.
> 
> haosdent huang wrote:
>     We delegate the create/destory operations of hierarchy to `CgroupsIsolatorProcess`.
Because if we do the create/destroy operations of hierarchy in `Subsystem::prepare()` and
`Subsystem::cleanup()`, it would create the hierarchy or destroy the hierarchy multiple times
when differnt subsystems mount in same hierarchies.
> 
> Qian Zhang wrote:
>     My idea is, here we should call each subsystem's `prepare()` (the second `for` loop
in this method), and each subsystem's `prepare()` should call its parent class's `prepare()`
method (i.e., `Subsystem::prepare()`), and in `Subsystem::prepare()`, we call the virtual
method `name()` first to know which subsystem that we are working on, and then ONLY create
cgroup for that subsytem. In this way, we keep the common cgroup operations (e.g., create
cgroup) only in parent class (e.g., `Subsystem::prepare()`), and we can still do some subsystem
specific works in child class (e.g., `MemorySubsystem::prepare()`).
>     
>     Currently, `Subsystem::prepare()`, `Subsystem::isolate()` and `Subsystem::cleanup()`
are just no-op, I think we should fully utilize them.
> 
> haosdent huang wrote:
>     Yes, but it would create/destroy the hierarchy multiple times if we put these operations
to `Subsystem`.
> 
> Qian Zhang wrote:
>     What did you mean about "create/destroy the hierachy"? Did you mean creating `/sys/fs/cgroup/<subsystem>/mesos`?
If yes, I think it is a one-time operation which will be only done in `CgroupsIsolatorProcess::create()`,
we do not need to call any methods of `Subsystem` at all.
> 
> haosdent huang wrote:
>     Oh, not. `/sys/fs/cgroup/<subsystem>/mesos` is the root hierachy. Suppose we
mount both `cpu` and `cpuacct` into `/sys/fs/cgroup/cpu,cpuacct/mesos`.
>     If we delegate the create operation to `Subsystem`, then we would create `/sys/fs/cgroup/cpu,cpuacct/mesos/${container_id}`
when `CpuSubsystem::prepare()` and
>     `CpuacctSubsystem::prepare()`. This would bring problems.
> 
> Qian Zhang wrote:
>     Understood. Then I think we should not **always** create these two subsystems: `CpuSubsystem`
and `CpuacctSubsystem`, instead, we should only have one CPU subsystem class `CpuSubsystem`
which handles both `cpu` and `cpuacct` subsystems, just like what the original `cgroups/cpu`
isolator does. And in the `Subsystem` class, we should have a vector field `subsystems`, for
`MemorySubsystem`/`NetClsSubsystem`/`PerfEventSubsystem`, this vector has only 1 element (i.e.,
`memory`/`net_cls`/`perf_event`), but for `CpuSubsystem`, if cpu and cpuacct are co-mounted
at `/sys/fs/cgroup/cpu,cpuacct`, its `subsystems` vector will have 1 elements, and if cpu
and cpuacct are mounted separately, it will have 2 elements. And then in `Subsystem::prepare()`,
it should go through each element of `subsystems` vector in a `for` loop, and create cgroup
accordingly.
>     
>     And in the patch https://reviews.apache.org/r/45086/ , I am thinking maybe we should
not expose `cgroups/cpuacct` in the flag `--isolation`, how `cpu` and `cpuacct` subsystems
are handled should be an internal thing which should be inside of `CpuSubsystem`. That means
for user, they should always specify `cgroups/cpu` as before and should not be aware of `cgroups/cpuacct`,
and inside of `CpuSubsystem`, we can create 1 subsystem or 2 subsystems based on whether cpu
and cpuacct are co-mounted or separately mounted. This should also be a good way to keep the
backward compatibility since there is no any changes to the existing user experience.

Hi, @qianzhang We could not do this:

>we should only have one CPU subsystem class CpuSubsystem which handles both cpu and cpuacct
subsystems

Because the main purpose of this ticket is try to resolve this problem. We manage two different
subsystems in `CgroupsCpushareIsolatorProcess` is because we could not manage co-mounted subsystems
in different isolators before. And this bring some tricky problems. So we try to solve this
problem and create this epic. As you see, not only `cpu` and `cpuacct` could co-mounted under
same hierarchy, we could mount `mem`, `perf_event` on `/sys/fs/cgroup/cpu,cpuacct` as well.
And in cgroup-v2, all subsystems would be co-mounted under only one hierarchy, so we try to
refactor all current cgroups isolators into one unified isolator. And make it fits the co-mounted
subsystems case better.


- haosdent


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/46158/#review137145
-----------------------------------------------------------


On June 19, 2016, 3:55 p.m., haosdent huang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/46158/
> -----------------------------------------------------------
> 
> (Updated June 19, 2016, 3:55 p.m.)
> 
> 
> Review request for mesos, Gilbert Song, Guangya Liu, Ian Downes, Jie Yu, Kevin Klues,
and Qian Zhang.
> 
> 
> Bugs: MESOS-5041
>     https://issues.apache.org/jira/browse/MESOS-5041
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Completed implementation of the cgroups unified isolator.
> 
> 
> Diffs
> -----
> 
>   src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/46158/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> haosdent huang
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message