mesos-reviews mailing list archives

From Benjamin Bannier <benjamin.bann...@mesosphere.io>
Subject Re: Review Request 65482: Improved handling of non-terminal operations after master failover.
Date Fri, 16 Feb 2018 14:13:02 GMT


> On Feb. 3, 2018, 12:06 a.m., Greg Mann wrote:
> > src/master/master.cpp
> > Lines 7594-7597 (original), 7609-7612 (patched)
> > <https://reviews.apache.org/r/65482/diff/1/?file=1952241#file1952241line7609>
> >
> >     Is this function now only called with resources from already-existing frameworks?
> >     Does that mean that we can update the code here?
> >     https://github.com/apache/mesos/blob/5a1433576eca20f15f1ea309fc202f4bbaf3b6c7/src/master/allocator/mesos/hierarchical.cpp#L690-L703
> 
> Benjamin Bannier wrote:
>     I am not sure we can guarantee that this will always happen in general. As a
>     precaution, I think it is still a good idea to leave this check in the allocator,
>     if only to document allocator-local invariants.

Dropping this.


- Benjamin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65482/#review196752
-----------------------------------------------------------


On Feb. 16, 2018, 3:12 p.m., Benjamin Bannier wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65482/
> -----------------------------------------------------------
> 
> (Updated Feb. 16, 2018, 3:12 p.m.)
> 
> 
> Review request for mesos, Greg Mann, Jie Yu, and Jan Schlicht.
> 
> 
> Bugs: MESOS-8536
>     https://issues.apache.org/jira/browse/MESOS-8536
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This patch fixes the handling of non-terminal operations learned by a
> newly elected master after a master failover, so that only these
> operations are counted as using resources. Previously we did not count
> any operations as using resources, which by accident produced the
> expected behavior if an operation was already terminal when the master
> learned about it.
> 
> We do not address the issue of being unable to properly account for
> operations triggered by frameworks unknown to the master; see
> MESOS-8582. Instead we emit a warning for now, since the master might
> still abort on assertion failures caused by incomplete resource
> accounting.
> 
> 
> Diffs
> -----
> 
>   src/master/master.cpp b06d7a6e2fbbb81b97eaf537d5b6745c73dc867d 
> 
> 
> Diff: https://reviews.apache.org/r/65482/diff/4/
> 
> 
> Testing
> -------
> 
> `make check`, also tested with a version of the test added in r/65045 which triggered
> this issue.
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>
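
The accounting rule in the description above can be illustrated with a minimal C++ sketch. It does not reproduce the patch to src/master/master.cpp: `Operation`, `OperationState`, `isTerminal`, and `usedByOperations` are hypothetical names chosen for illustration, and resources are reduced to a single CPU count as a simplifying assumption.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins; these are NOT the Mesos types.
enum class OperationState { PENDING, FINISHED, FAILED };

struct Operation {
  std::string id;
  OperationState state;
  double cpus;  // Assumption: resources reduced to one scalar.
};

// In this sketch, every state other than PENDING is terminal.
bool isTerminal(OperationState state) {
  return state != OperationState::PENDING;
}

// Sums the resources of operations that are still in flight.
// Terminal operations are skipped because they no longer use resources.
double usedByOperations(const std::vector<Operation>& operations) {
  double used = 0.0;
  for (const Operation& operation : operations) {
    assert(operation.cpus >= 0.0);
    if (!isTerminal(operation.state)) {
      used += operation.cpus;
    }
  }
  return used;
}
```

With one pending operation holding 1.0 CPUs, one finished operation holding 2.0, and another pending operation holding 0.5, the sketch counts 1.5 CPUs as used. Counting no operations at all, as described for the previous behavior, would give the right answer only when every operation learned after failover is already terminal.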

