mesos-reviews mailing list archives

From Benjamin Bannier <benjamin.bann...@mesosphere.io>
Subject Re: Review Request 65482: Improved handling of non-terminal operations after master failover.
Date Fri, 16 Feb 2018 14:12:37 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65482/
-----------------------------------------------------------

(Updated Feb. 16, 2018, 3:12 p.m.)


Review request for mesos, Greg Mann, Jie Yu, and Jan Schlicht.


Changes
-------

Fixed comment issues pointed out by Greg; improved logging.


Bugs: MESOS-8536
    https://issues.apache.org/jira/browse/MESOS-8536


Repository: mesos


Description
-------

This patch fixes how a newly elected master handles the non-terminal
operations it learns about after a master failover: only these
operations are now counted as using resources. Previously we did not
count any operations as using resources, which by accident produced the
expected behavior if an operation was already terminal when the master
learned about it.
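
The accounting rule described above can be sketched as follows. This is
only an illustration under simplifying assumptions, not the code in
src/master/master.cpp: `Operation`, `OperationState`, `isTerminal`, and
`usedByOperations` are hypothetical names, and resources are reduced to
a single integer.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for illustration; not the Mesos types.
enum class OperationState { PENDING, FINISHED, FAILED };

struct Operation {
  std::string id;
  OperationState state;
  int cpus;  // Resources the operation consumes while in flight.
};

// In this sketch every state other than PENDING is terminal.
bool isTerminal(OperationState state) {
  return state != OperationState::PENDING;
}

// Sums the resources of operations that are still in flight. Terminal
// operations no longer hold resources and are skipped.
int usedByOperations(const std::vector<Operation>& operations) {
  int used = 0;
  for (const Operation& operation : operations) {
    if (!isTerminal(operation.state)) {
      assert(operation.cpus >= 0);
      used += operation.cpus;
    }
  }
  return used;
}
```

With one finished and two pending operations, only the two pending ones
contribute to the total in this sketch.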

We do not address the issue of being unable to properly account for
operations triggered by frameworks unknown to the master; see
MESOS-8582. Instead we emit a warning for now, since the master might
still abort on assertion failures caused by incomplete resource
accounting.
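
The warn-and-skip behavior for unknown frameworks might look roughly
like the sketch below. It is an assumption-laden illustration, not the
actual patch: `LearnedOperation`, `trackable`, and the warning text are
hypothetical, and the real master logs through its own logging
facilities rather than `std::cerr`.

```cpp
#include <cassert>
#include <iostream>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for illustration; not the Mesos type.
struct LearnedOperation {
  std::string id;
  std::string frameworkId;
  bool terminal;
};

// Returns the ids of non-terminal operations whose framework is known.
// Non-terminal operations from unknown frameworks cannot be accounted
// for in this sketch, so they are skipped with a warning.
std::vector<std::string> trackable(
    const std::vector<LearnedOperation>& operations,
    const std::set<std::string>& knownFrameworks) {
  std::vector<std::string> result;
  for (const LearnedOperation& operation : operations) {
    if (operation.terminal) {
      continue;  // Terminal operations use no resources.
    }
    if (knownFrameworks.count(operation.frameworkId) == 0) {
      std::cerr << "WARNING: cannot account for operation '" << operation.id
                << "' of unknown framework '" << operation.frameworkId
                << "'" << std::endl;
      continue;
    }
    result.push_back(operation.id);
  }
  assert(result.size() <= operations.size());
  return result;
}
```

Skipping such operations keeps the sketch simple, but it leaves the
accounting incomplete, which is the limitation tracked in MESOS-8582.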


Diffs (updated)
-----

  src/master/master.cpp b06d7a6e2fbbb81b97eaf537d5b6745c73dc867d 


Diff: https://reviews.apache.org/r/65482/diff/4/

Changes: https://reviews.apache.org/r/65482/diff/3-4/


Testing
-------

`make check`. Also tested with a version of the test added in r/65045,
which triggered this issue.


Thanks,

Benjamin Bannier

