mesos-reviews mailing list archives

From "Alexander Rukletsov" <ruklet...@gmail.com>
Subject Re: Review Request 39401: Quota: Updated allocate() in the hierarchical allocator to support quota.
Date Sat, 14 Nov 2015 15:50:04 GMT


> On Oct. 26, 2015, 1:49 p.m., Qian Zhang wrote:
> > For this patch, the quota-related code is added inside the slave foreach loop in the `HierarchicalAllocatorProcess::allocate(const hashset<SlaveID>& slaveIds_)` method, which means that for **each slave** we handle quota first and then the existing DRF fair share. I think there might be an issue with this approach: say that the first slave's available unreserved non-revocable resources cannot satisfy a role's quota because the framework in this role has a filter for this slave, and we then immediately lay aside the filtered resources of this slave for this role. I think this may be too early, since the other slaves may have resources that can satisfy this role's quota. But if we lay aside this slave's resources for this role at this point, the result is that the framework of this role will not use these resources (due to the filter) AND no other role's frameworks can be offered these resources either, which wastes resources.
> > 
> > I think we could instead handle quota support this way: in `HierarchicalAllocatorProcess::allocate(const hashset<SlaveID>& slaveIds_)`, leave the existing three levels of foreach loops (slave/role/framework) as they are, and add the quota-related code separately before them: traverse all quota'ed roles, and for each of them traverse all the slaves, allocating each slave's available unreserved non-revocable resources to the role's frameworks (taking filters and suppression into account) until the role's quota is satisfied. After all the quota'ed roles have been traversed, if some roles' quotas are still unsatisfied, lay aside resources for them (which should be the resources that were filtered or suppressed). This way, we try our best to use all slaves' available resources to satisfy the quotas before laying anything aside, so fewer resources should be wasted (see the sketch below).
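
To make the proposed control flow concrete, here is a minimal sketch of the two-phase `allocate()` described above. The types (`Agent`, `Role`, `Framework`) and the `isFiltered()` helper are hypothetical stand-ins, not the actual Mesos allocator code:

```cpp
#include <vector>

struct Resources { double mem = 0; };
struct Framework { int id = 0; };

struct Role
{
  double quota = 0;      // 0 means no quota is set for this role.
  double allocated = 0;  // Resources currently allocated to the role.
  std::vector<Framework> frameworks;
};

struct Agent { Resources available; };

// Hypothetical filter check: has the framework declined offers from
// this agent? Stubbed out here.
bool isFiltered(const Framework&, const Agent&) { return false; }

void allocate(std::vector<Agent>& agents, std::vector<Role>& roles)
{
  // Phase 1: for each quota'ed role, walk *all* agents before laying
  // anything aside, so a filter on one agent does not withhold its
  // resources when another agent could satisfy the quota.
  for (Role& role : roles) {
    if (role.quota == 0) {
      continue;
    }

    for (Agent& agent : agents) {
      if (role.allocated >= role.quota) {
        break;  // Quota satisfied; nothing to lay aside for this role.
      }

      for (const Framework& framework : role.frameworks) {
        if (isFiltered(framework, agent)) {
          continue;
        }

        // Allocate the agent's available unreserved non-revocable
        // resources towards the role's quota.
        role.allocated += agent.available.mem;
        agent.available.mem = 0;
        break;
      }
    }

    // Only if the quota is still unsatisfied at this point do we lay
    // aside resources, i.e. those that every framework in the role
    // filtered out (bookkeeping elided).
  }

  // Phase 2: the existing DRF slave/role/framework loops run unchanged
  // on whatever remains (elided).
}
```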
> 
> Alexander Rukletsov wrote:
>     I'm not sure I got your point. If my mental compiler is correct: if a framework in a quota'ed role opts out, we do not immediately lay aside resources. We do that only after we have checked all the frameworks in the role, in a separate loop.
> 
> Qian Zhang wrote:
>     Let me clarify my point with an example:
>     Say the Mesos cluster has two agents, a1 and a2, each with 4GB of memory, and two roles, r1 and r2, where r1 has a quota set (4GB) and r2 has no quota set. r1 has a framework f1 which currently has no allocation but has a filter (4GB memory on a1); r2 has a framework f2 which also currently has no allocation and has no filter. There are no static/dynamic reservations and no revocable resources. Now, with the logic in this patch: for a1, in the quotaRoleSorter foreach loop, when we handle the quota for r1 we will not allocate a1's resources to f1 because f1 has a filter, so a1's 4GB of memory will be laid aside to satisfy r1's quota. Then, in the roleSorter foreach loop, we will NOT allocate a1's resources to f2 either, since a1 has no available resources: all of its 4GB has been laid aside for r1. Then, when we handle a2, its 4GB will be allocated to f1, so f2 gets nothing in the end. The result is that a1's 4GB is laid aside to satisfy r1's quota, a2's 4GB is allocated to f1, and f2 gets nothing. But I think the expected result for this example should be that f1 gets a2's 4GB (r1's quota is still satisfied) and f2 gets a1's 4GB.
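
A toy trace of this scenario, with plain doubles standing in for the allocator's `Resources` and sorters, reproduces the outcome described above:

```cpp
#include <iostream>

int main()
{
  double a1 = 4.0, a2 = 4.0;  // Available GB of memory on each agent.
  double f1 = 0.0, f2 = 0.0;  // Allocations to each framework.
  double r1LaidAside = 0.0;   // Resources withheld for r1's 4GB quota.

  // Agent a1, quota loop: f1 filters a1, so nothing is allocated and
  // a1's 4GB is laid aside towards r1's unsatisfied quota.
  r1LaidAside += a1; a1 = 0.0;

  // Agent a1, fair-share loop: nothing left on a1 for f2.

  // Agent a2, quota loop: f1 does not filter a2, so it gets all 4GB,
  // satisfying r1's quota.
  f1 += a2; a2 = 0.0;

  // Agent a2, fair-share loop: nothing left on a2 for f2.

  std::cout << "f1=" << f1 << "GB f2=" << f2 << "GB laid aside="
            << r1LaidAside << "GB" << std::endl;
  // Prints: f1=4GB f2=0GB laid aside=4GB
  return 0;
}
```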
> 
> Alexander Rukletsov wrote:
>     This can happen within an allocation cycle, but we do not persist laid-aside resources between allocation cycles. Without refactoring `allocate()` we do not know whether we will reach a suitable agent, hence we have to lay resources aside. But at the next allocation cycle `r1`'s quota is satisfied and `f2` will get `a1`'s 4GB, which is OK in my opinion.
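
Continuing the toy trace above into a hypothetical second allocation cycle shows how the outcome self-corrects:

```cpp
#include <iostream>

int main()
{
  // State after cycle 1: f1 holds a2's 4GB and r1's quota is satisfied.
  // a1's 4GB is free again, because laid-aside resources are recomputed
  // from scratch at every allocation cycle rather than persisted.
  double a1 = 4.0;
  double f1 = 4.0, f2 = 0.0;

  // Cycle 2, quota loop: r1's allocation already meets its quota, so
  // nothing is allocated or laid aside for r1.

  // Cycle 2, fair-share loop: f2 has no filter on a1, so it gets the 4GB.
  f2 += a1; a1 = 0.0;

  std::cout << "f1=" << f1 << "GB f2=" << f2 << "GB" << std::endl;
  // Prints: f1=4GB f2=4GB
  return 0;
}
```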
> 
> Qian Zhang wrote:
>     Yes, I understand `f2` will get 4GB at the ***next*** allocation cycle. But with the proposal in my first post, both `f1` and `f2` can get 4GB in a ***single*** allocation cycle: when we find we cannot allocate `a1`'s 4GB to `f1` due to the filter, we will NOT lay aside resources at that point; instead we will try `a2` and allocate `a2`'s 4GB to `f1`, and then, when we handle the fair share, we will allocate `a1`'s 4GB to `f2`. I think this proposal also aligns with the principle mentioned in the design doc: quota first, fair share second. My understanding of this design principle is that we handle all roles' quotas across all slaves first, and then handle all roles' fair share across all slaves (my proposal), rather than handling, for ***each slave***, all roles' quotas and then all roles' fair share (this patch).
> 
> Alexander Rukletsov wrote:
>     The design doc addresses the semantics, not the actual implementation. The important thing is to ensure that no allocations to non-quota'ed roles are made while there are unsatisfied quotas. I think the difference between the two approaches is not that big. I tend to agree that mine can be slightly more confusing, but I would argue it is less intrusive (especially given that we plan an allocator refactoring).

We had an offline discussion with Joris and BenM regarding your proposal and agreed it has
less impact on non-quota'ed frameworks. I will update the request soon.


- Alexander


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39401/#review104012
-----------------------------------------------------------


On Nov. 9, 2015, 10:01 p.m., Alexander Rukletsov wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/39401/
> -----------------------------------------------------------
> 
> (Updated Nov. 9, 2015, 10:01 p.m.)
> 
> 
> Review request for mesos, Bernd Mathiske, Joerg Schad, Joris Van Remoortere, and Joseph Wu.
> 
> 
> Bugs: MESOS-3718
>     https://issues.apache.org/jira/browse/MESOS-3718
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> See summary.
> 
> 
> Diffs
> -----
> 
>   src/master/allocator/mesos/hierarchical.cpp 14fef63714721fcda7cea3c28704766efda6d007
> 
> Diff: https://reviews.apache.org/r/39401/diff/
> 
> 
> Testing
> -------
> 
> make check (Mac OS X 10.10.4)
> 
> 
> Thanks,
> 
> Alexander Rukletsov
> 
>

