mesos-reviews mailing list archives

From Andrei Sekretenko <asekrete...@mesosphere.io>
Subject Re: Review Request 71646: Got rid of `Resources::add()/subtract()` for agent resources in Sorter.
Date Mon, 28 Oct 2019 18:45:30 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71646/
-----------------------------------------------------------

(Updated Oct. 28, 2019, 6:45 p.m.)


Review request for mesos, Benjamin Mahler and Meng Zhu.


Changes
-------

Merged https://reviews.apache.org/r/71672 and https://reviews.apache.org/r/71673/ into this
patch.


Summary (updated)
-----------------

Got rid of `Resources::add()/subtract()` for agent resources in Sorter.


Bugs: MESOS-10015
    https://issues.apache.org/jira/browse/MESOS-10015


Repository: mesos


Description (updated)
---------------------

This patch addresses the issue with poor performance of
`HierarchicalAllocatorProcess::updateAllocation()` for agents with
a huge number of non-addable resources (see MESOS-10015).

Sorter methods that modify `Resources` of an agent in the Sorter
are replaced with methods that add/remove resource quantities of an
agent as a whole (which was actually the only use case of the old
methods). Thus subtracting/adding of `Resources` of the whole agent
no longer occurs when updating resources of the agent in the Sorter.

Further, this patch removes agent resource tracking from the random
sorter entirely: the cluster totals are now tracked inside the
allocator.
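
The idea in the two paragraphs above can be sketched as follows. This is a minimal illustration, not the code from the patch: `Quantities` and `TotalsTracker` are hypothetical stand-ins (they are not the Mesos classes), and the method names `addAgent()`/`removeAgent()` are assumptions made for the example. The point it shows is that adding or removing an agent's totals touches one entry per resource name, regardless of how many individual reservations the agent holds.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for an agent's scalar totals keyed by resource
// name (e.g. "cpus", "mem"). Not the actual Mesos type.
struct Quantities
{
  std::map<std::string, double> scalars;

  Quantities& operator+=(const Quantities& that)
  {
    for (const auto& entry : that.scalars) {
      scalars[entry.first] += entry.second;
    }
    return *this;
  }

  Quantities& operator-=(const Quantities& that)
  {
    for (const auto& entry : that.scalars) {
      auto it = scalars.find(entry.first);
      // Subtracting more than was added is a bookkeeping error.
      assert(it != scalars.end() && it->second >= entry.second);
      it->second -= entry.second;
      if (it->second == 0.0) {
        scalars.erase(it);
      }
    }
    return *this;
  }
};

// Hypothetical bookkeeping of cluster totals: an agent is added or
// removed as a whole, so the cost depends on the number of resource
// names, not on the number of reservations on the agent.
struct TotalsTracker
{
  Quantities total;

  void addAgent(const Quantities& agent) { total += agent; }
  void removeAgent(const Quantities& agent) { total -= agent; }
};
```

Under these assumptions, updating an agent amounts to `removeAgent(oldQuantities)` followed by `addAgent(newQuantities)`, with no per-reservation comparison of `Resources` objects.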

Results of `*BENCHMARK_WithReservationParam.UpdateAllocation*`
(for the DRF sorter):

Master:
Agent resources size: 200 (50 frameworks)
Made 20 reserve and unreserve operations in 2.08586secs
Agent resources size: 400 (100 frameworks)
Made 20 reserve and unreserve operations in 13.8449005secs
Agent resources size: 800 (200 frameworks)
Made 20 reserve and unreserve operations in 2.19253121188333mins

Master + this patch:
Agent resources size: 200 (50 frameworks)
Made 20 reserve and unreserve operations in 463.223084ms
Agent resources size: 400 (100 frameworks)
Made 20 reserve and unreserve operations in 930.097972ms
Agent resources size: 800 (200 frameworks)
Made 20 reserve and unreserve operations in 2.160506847secs
...
Agent resources size: 6400 (1600 frameworks)
Made 20 reserve and unreserve operations in 1.50729369885mins


Diffs (updated)
---------------

  src/master/allocator/mesos/hierarchical.hpp 9d0fbe771868ea60e66b9e25b0c666d5416d6e85 
  src/master/allocator/mesos/hierarchical.cpp 21010de363f25c516bb031e4ae48888e53621128 
  src/master/allocator/mesos/sorter/drf/sorter.hpp 3f6c7413f1b76f3fa86388360983763c8b76079f
  src/master/allocator/mesos/sorter/drf/sorter.cpp ef79083b710fba628b4a7e93f903883899f8a71b
  src/master/allocator/mesos/sorter/random/sorter.hpp a3097be98d175d2b47714eb8b70b1ce8c5c2bba8
  src/master/allocator/mesos/sorter/random/sorter.cpp 86aeb1b8136eaffd2d52d3b603636b01383a9024
  src/master/allocator/mesos/sorter/sorter.hpp 6b6b4a1811ba36e0212de17b9a6e63a6f8678a7f 
  src/tests/sorter_tests.cpp d7fdee8f2cab4c930230750f0bd1a55eb08f89bb 


Diff: https://reviews.apache.org/r/71646/diff/3/

Changes: https://reviews.apache.org/r/71646/diff/2-3/


Testing
-------

**make check**

**Variant of `ReservationParam/HierarchicalAllocator__BENCHMARK_WithReservationParam`**
from https://reviews.apache.org/r/71639/ (work in progress)
shows a significant improvement, with scaling changing from O(number_of_roles^3) to O(number_of_roles^2):
**Before**:
Agent resources size: 200 (50 roles, 1 reservations per role, 1 port ranges)
Made 20 reserve and unreserve operations in 2.08586secs
Average UNRESERVE duration: 51.491561ms
Average RESERVE duration: 52.801438ms

Agent resources size: 400 (100 roles, 1 reservations per role, 1 port ranges)
Made 20 reserve and unreserve operations in 13.8449005secs
Average UNRESERVE duration: 347.624639ms
Average RESERVE duration: 344.620385ms

Agent resources size: 800 (200 roles, 1 reservations per role, 1 port ranges)
Made 20 reserve and unreserve operations in 2.19253121188333mins
Average UNRESERVE duration: 3.285422441secs
Average RESERVE duration: 3.292171194secs

Agent resources size: 1600 (400 roles, 1 reservations per role, 1 port ranges)
(killed after several minutes)

**After:**
Agent resources size: 200 (50 roles, 1 reservations per role, 1 port ranges)
Made 20 reserve and unreserve operations in 471.781183ms
Average UNRESERVE duration: 12.112223ms
Average RESERVE duration: 11.476835ms

Agent resources size: 400 (100 roles, 1 reservations per role, 1 port ranges)
Made 20 reserve and unreserve operations in 1.022879058secs
Average UNRESERVE duration: 25.53819ms
Average RESERVE duration: 25.605762ms

Agent resources size: 800 (200 roles, 1 reservations per role, 1 port ranges)
Made 20 reserve and unreserve operations in 2.622324521secs
Average UNRESERVE duration: 65.166039ms
Average RESERVE duration: 65.950186ms

Agent resources size: 1600 (400 roles, 1 reservations per role, 1 port ranges)
Made 20 reserve and unreserve operations in 8.419455875secs
Average UNRESERVE duration: 209.886948ms
Average RESERVE duration: 211.085845ms

Agent resources size: 3200 (800 roles, 1 reservations per role, 1 port ranges)
Made 20 reserve and unreserve operations in 32.82614382secs
Average UNRESERVE duration: 823.126069ms
Average RESERVE duration: 818.181121ms

Agent resources size: 6400 (1600 roles, 1 reservations per role, 1 port ranges)
Made 20 reserve and unreserve operations in 2.04261335795mins
Average UNRESERVE duration: 3.063538394secs
Average RESERVE duration: 3.064301679secs
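
A rough way to check the scaling claim against the totals listed above: since each step doubles the number of roles, `log2` of the ratio of consecutive timings estimates the exponent. The helper name `doublingExponents` and the two data vectors are introduced here for illustration only; the numbers are the "Made 20 reserve and unreserve operations" totals from this section, converted to seconds. This is an empirical sanity check, not a proof of the complexity.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Returns log2(t[i] / t[i-1]) for measurements taken at doubling input
// sizes. Values near 3 suggest cubic growth, values near 2 quadratic.
std::vector<double> doublingExponents(const std::vector<double>& seconds)
{
  std::vector<double> exponents;
  for (std::size_t i = 1; i < seconds.size(); ++i) {
    assert(seconds[i - 1] > 0.0);
    exponents.push_back(std::log2(seconds[i] / seconds[i - 1]));
  }
  return exponents;
}

// "Before" totals for 50, 100 and 200 roles, in seconds.
const std::vector<double> beforeSeconds = {
  2.08586, 13.8449005, 2.19253121188333 * 60};

// "After" totals for 50, 100, 200, 400, 800 and 1600 roles, in seconds.
const std::vector<double> afterSeconds = {
  0.471781183, 1.022879058, 2.622324521,
  8.419455875, 32.82614382, 2.04261335795 * 60};
```

With these inputs, the "before" exponents come out at roughly 2.7 and 3.2, while the "after" exponents rise from about 1.1 at the smallest sizes to about 1.9 at the largest, which is in line with the stated change from cubic to quadratic growth.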

**No significant performance changes in `QuotaParam/BENCHMARK_HierarchicalAllocator_WithQuotaParam.LargeAndSmallQuota`.**

**Before:**

Added 30 agents in 1.175593ms
Added 30 frameworks in 6.829173ms
Benchmark setup: 30 agents, 30 roles, 30 frameworks, with drf sorter
Made 36 allocations in 8.294832ms
Made 0 allocation in 3.674923ms

Added 300 agents in 7.860046ms
Added 300 frameworks in 149.743858ms
Benchmark setup: 300 agents, 300 roles, 300 frameworks, with drf sorter
Made 350 allocations in 132.796102ms
Made 0 allocation in 107.887758ms

Added 3000 agents in 36.944587ms
Added 3000 frameworks in 10.688501403secs
Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with drf sorter
Made 3500 allocations in 12.6020582secs
Made 0 allocation in 9.716229696secs

Added 30 agents in 1.010362ms
Added 30 frameworks in 6.272027ms
Benchmark setup: 30 agents, 30 roles, 30 frameworks, with random sorter
Made 38 allocations in 9.119976ms
Made 0 allocation in 5.460369ms

Added 300 agents in 7.442897ms
Added 300 frameworks in 152.016597ms
Benchmark setup: 300 agents, 300 roles, 300 frameworks, with random sorter
Made 391 allocations in 195.242282ms
Made 0 allocation in 139.638551ms

Added 3000 agents in 36.003028ms
Added 3000 frameworks in 11.203697649secs
Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
Made 3856 allocations in 17.807913455secs
Made 0 allocation in 13.524946653secs

**After:**

Added 30 agents in 1.196576ms
Added 30 frameworks in 6.814792ms
Benchmark setup: 30 agents, 30 roles, 30 frameworks, with drf sorter
Made 36 allocations in 8.263036ms
Made 0 allocation in 3.947283ms

Added 300 agents in 8.497121ms
Added 300 frameworks in 156.578165ms
Benchmark setup: 300 agents, 300 roles, 300 frameworks, with drf sorter
Made 350 allocations in 168.745307ms
Made 0 allocation in 95.505069ms

Added 3000 agents in 38.074525ms
Added 3000 frameworks in 11.249150205secs
Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with drf sorter
Made 3500 allocations in 12.772526049secs
Made 0 allocation in 10.132801781secs

Added 30 agents in 799844ns
Added 30 frameworks in 5.8663ms
Benchmark setup: 30 agents, 30 roles, 30 frameworks, with random sorter
Made 38 allocations in 9.612524ms
Made 0 allocation in 5.150924ms

Added 300 agents in 5.560583ms
Added 300 frameworks in 138.469712ms
Benchmark setup: 300 agents, 300 roles, 300 frameworks, with random sorter
Made 391 allocations in 175.021255ms
Made 0 allocation in 138.181869ms

Added 3000 agents in 42.921689ms
Added 3000 frameworks in 10.825018278secs
Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
Made 3856 allocations in 15.29232742secs
Made 0 allocation in 14.202057473secs


Thanks,

Andrei Sekretenko

