mesos-reviews mailing list archives

From Benjamin Mahler <bmah...@apache.org>
Subject Re: Review Request 71646: Optimized tracking of cluster resource totals.
Date Wed, 30 Oct 2019 23:47:32 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71646/#review218468
-----------------------------------------------------------




src/master/allocator/mesos/sorter/drf/sorter.cpp
Line 464 (original), 455-456 (patched)
<https://reviews.apache.org/r/71646/#comment306214>

    Whitespace is off here?


- Benjamin Mahler


On Oct. 29, 2019, 2:56 p.m., Andrei Sekretenko wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71646/
> -----------------------------------------------------------
> 
> (Updated Oct. 29, 2019, 2:56 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Meng Zhu.
> 
> 
> Bugs: MESOS-10015
>     https://issues.apache.org/jira/browse/MESOS-10015
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This patch addresses poor performance of
> `HierarchicalAllocatorProcess::updateAllocation()` for agents with
> a huge number of non-addable resources in a many-framework case
> (see MESOS-10015).
> 
> Sorter methods for totals tracking that modify `Resources` of an agent
> in the Sorter are replaced with methods that add/remove resource
> quantities of an agent as a whole (which was actually the only use case
> of the old methods). Thus, subtracting/adding `Resources` of a whole
> agent no longer occurs when updating resources of an agent in a Sorter.
> 
> Further, this patch completely removes agent resource tracking logic
> from the random sorter (which makes no use of agent resources itself)
> by implementing cluster totals tracking in the allocator.
> 
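
To make the change described above concrete, here is a minimal sketch of whole-agent totals tracking. It is written in Python for brevity (the Mesos allocator itself is C++), and `ClusterTotals`, `add_agent`, and `remove_agent` are hypothetical names, not the API introduced by the patch. It assumes that quantities are plain per-name scalars carrying no reservation metadata, so a reserve or unreserve operation on an agent leaves them unchanged.

```python
class ClusterTotals:
    """Hypothetical stand-in for cluster totals tracking (not the Mesos API)."""

    def __init__(self):
        self.totals = {}     # resource name -> summed scalar quantity
        self.by_agent = {}   # agent id -> quantities contributed by that agent

    def add_agent(self, agent_id, quantities):
        # An agent is added as a whole; partial updates are not supported.
        if agent_id in self.by_agent:
            raise ValueError("agent already tracked: " + agent_id)
        self.by_agent[agent_id] = dict(quantities)
        for name, amount in quantities.items():
            self.totals[name] = self.totals.get(name, 0) + amount

    def remove_agent(self, agent_id):
        # Subtract exactly what the agent contributed when it was added.
        quantities = self.by_agent.pop(agent_id)
        for name, amount in quantities.items():
            remaining = self.totals[name] - amount
            if remaining == 0:
                del self.totals[name]
            else:
                self.totals[name] = remaining


totals = ClusterTotals()
totals.add_agent("agent-1", {"cpus": 4, "mem": 1024})
totals.add_agent("agent-2", {"cpus": 2})
print(totals.totals)
```

Under the stated assumption, an operation that only changes reservations never needs to call either method, which is the property the description relies on: the cost of updating totals no longer depends on how many distinct `Resources` entries the agent holds.
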
> Results of `*BENCHMARK_WithReservationParam.UpdateAllocation*`
> (for the DRF sorter):
> 
> Master:
> Agent resources size: 200 (50 frameworks)
> Made 20 reserve and unreserve operations in 2.08586secs
> Agent resources size: 400 (100 frameworks)
> Made 20 reserve and unreserve operations in 13.8449005secs
> Agent resources size: 800 (200 frameworks)
> Made 20 reserve and unreserve operations in 2.19253121188333mins
> 
> Master + this patch:
> Agent resources size: 200 (50 frameworks)
> Made 20 reserve and unreserve operations in 468.482366ms
> Agent resources size: 400 (100 frameworks)
> Made 20 reserve and unreserve operations in 925.725947ms
> Agent resources size: 800 (200 frameworks)
> Made 20 reserve and unreserve operations in 2.110337109secs
> ...
> Agent resources size: 6400 (1600 frameworks)
> Made 20 reserve and unreserve operations in 1.50141861756667mins
> 
> 
> Diffs
> -----
> 
>   src/master/allocator/mesos/hierarchical.hpp 9d0fbe771868ea60e66b9e25b0c666d5416d6e85
>   src/master/allocator/mesos/hierarchical.cpp 21010de363f25c516bb031e4ae48888e53621128
>   src/master/allocator/mesos/sorter/drf/sorter.hpp 3f6c7413f1b76f3fa86388360983763c8b76079f
>   src/master/allocator/mesos/sorter/drf/sorter.cpp ef79083b710fba628b4a7e93f903883899f8a71b
>   src/master/allocator/mesos/sorter/random/sorter.hpp a3097be98d175d2b47714eb8b70b1ce8c5c2bba8
>   src/master/allocator/mesos/sorter/random/sorter.cpp 86aeb1b8136eaffd2d52d3b603636b01383a9024
>   src/master/allocator/mesos/sorter/sorter.hpp 6b6b4a1811ba36e0212de17b9a6e63a6f8678a7f
>   src/tests/sorter_tests.cpp d7fdee8f2cab4c930230750f0bd1a55eb08f89bb
> 
> 
> Diff: https://reviews.apache.org/r/71646/diff/4/
> 
> 
> Testing
> -------
> 
> **make check**
> 
> **Variant of `ReservationParam/HierarchicalAllocator__BENCHMARK_WithReservationParam`**
> from https://reviews.apache.org/r/71639/ (work in progress)
> shows a significant improvement: the scaling changes from O(number_of_roles^3) to O(number_of_roles^2):
> **Before**:
> Agent resources size: 200 (50 roles, 1 reservations per role, 1 port ranges)
> Made 20 reserve and unreserve operations in 2.08586secs
> Average UNRESERVE duration: 51.491561ms
> Average RESERVE duration: 52.801438ms
> 
> Agent resources size: 400 (100 roles, 1 reservations per role, 1 port ranges)
> Made 20 reserve and unreserve operations in 13.8449005secs
> Average UNRESERVE duration: 347.624639ms
> Average RESERVE duration: 344.620385ms
> 
> Agent resources size: 800 (200 roles, 1 reservations per role, 1 port ranges)
> Made 20 reserve and unreserve operations in 2.19253121188333mins
> Average UNRESERVE duration: 3.285422441secs
> Average RESERVE duration: 3.292171194secs
> 
> Agent resources size: 1600 (400 roles, 1 reservations per role, 1 port ranges)
> (killed after several minutes)
> 
> **After:**
> Agent resources size: 200 (50 roles, 1 reservations per role, 1 port ranges)
> Made 20 reserve and unreserve operations in 468.482366ms
> Average UNRESERVE duration: 10.979921ms
> Average RESERVE duration: 12.444196ms
> 
> Agent resources size: 400 (100 roles, 1 reservations per role, 1 port ranges)
> Made 20 reserve and unreserve operations in 925.725947ms
> Average UNRESERVE duration: 23.377155ms
> Average RESERVE duration: 22.909141ms
> 
> Agent resources size: 800 (200 roles, 1 reservations per role, 1 port ranges)
> Made 20 reserve and unreserve operations in 2.110337109secs
> Average UNRESERVE duration: 52.53835ms
> Average RESERVE duration: 52.978505ms
> 
> Agent resources size: 1600 (400 roles, 1 reservations per role, 1 port ranges)
> Made 20 reserve and unreserve operations in 6.524451736secs
> Average UNRESERVE duration: 162.464708ms
> Average RESERVE duration: 163.757877ms
> 
> Agent resources size: 3200 (800 roles, 1 reservations per role, 1 port ranges)
> Made 20 reserve and unreserve operations in 24.696928676secs
> Average UNRESERVE duration: 609.666416ms
> Average RESERVE duration: 625.180017ms
> 
> Agent resources size: 6400 (1600 roles, 1 reservations per role, 1 port ranges)
> Made 20 reserve and unreserve operations in 1.50141861756667mins
> Average UNRESERVE duration: 2.269904993secs
> Average RESERVE duration: 2.234350859secs
> 
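
The complexity claim can be sanity-checked against the timings quoted above. The sketch below (Python, with a hypothetical helper `scaling_exponents`) estimates the empirical exponent between consecutive runs as log(t2/t1) / log(n2/n1), using only the total durations reported in this message, converted to seconds.

```python
import math


def scaling_exponents(sizes, seconds):
    """Estimate the exponent k in t ~ n^k between consecutive measurements."""
    points = list(zip(sizes, seconds))
    return [
        math.log(t2 / t1) / math.log(n2 / n1)
        for (n1, t1), (n2, t2) in zip(points, points[1:])
    ]


# Totals for 20 reserve and unreserve operations, copied from the results above.
before_roles = [50, 100, 200]
before_seconds = [2.08586, 13.8449005, 2.19253121188333 * 60]

after_roles = [50, 100, 200, 400, 800, 1600]
after_seconds = [
    0.468482366,
    0.925725947,
    2.110337109,
    6.524451736,
    24.696928676,
    1.50141861756667 * 60,
]

before_exponents = scaling_exponents(before_roles, before_seconds)
after_exponents = scaling_exponents(after_roles, after_seconds)

print("before:", [round(k, 2) for k in before_exponents])
print("after: ", [round(k, 2) for k in after_exponents])
```

With these numbers the estimated exponents before the patch come out at roughly 2.7 and 3.2, while after the patch they rise from about 1 at small sizes to a little under 2 at the largest sizes. That is consistent with the cubic-to-quadratic change stated above, though a handful of data points cannot prove an asymptotic bound.
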
> **No significant performance changes in `QuotaParam/BENCHMARK_HierarchicalAllocator_WithQuotaParam.LargeAndSmallQuota`.**
> 
> **Before:**
> 
> Added 30 agents in 1.175593ms
> Added 30 frameworks in 6.829173ms
> Benchmark setup: 30 agents, 30 roles, 30 frameworks, with drf sorter
> Made 36 allocations in 8.294832ms
> Made 0 allocation in 3.674923ms
> 
> Added 300 agents in 7.860046ms
> Added 300 frameworks in 149.743858ms
> Benchmark setup: 300 agents, 300 roles, 300 frameworks, with drf sorter
> Made 350 allocations in 132.796102ms
> Made 0 allocation in 107.887758ms
> 
> Added 3000 agents in 36.944587ms
> Added 3000 frameworks in 10.688501403secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with drf sorter
> Made 3500 allocations in 12.6020582secs
> Made 0 allocation in 9.716229696secs
> 
> Added 30 agents in 1.010362ms
> Added 30 frameworks in 6.272027ms
> Benchmark setup: 30 agents, 30 roles, 30 frameworks, with random sorter
> Made 38 allocations in 9.119976ms
> Made 0 allocation in 5.460369ms
> 
> Added 300 agents in 7.442897ms
> Added 300 frameworks in 152.016597ms
> Benchmark setup: 300 agents, 300 roles, 300 frameworks, with random sorter
> Made 391 allocations in 195.242282ms
> Made 0 allocation in 139.638551ms
> 
> Added 3000 agents in 36.003028ms
> Added 3000 frameworks in 11.203697649secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 17.807913455secs
> Made 0 allocation in 13.524946653secs
> 
> **After:**
> 
> Added 30 agents in 1.196576ms
> Added 30 frameworks in 6.814792ms
> Benchmark setup: 30 agents, 30 roles, 30 frameworks, with drf sorter
> Made 36 allocations in 8.263036ms
> Made 0 allocation in 3.947283ms
> 
> Added 300 agents in 8.497121ms
> Added 300 frameworks in 156.578165ms
> Benchmark setup: 300 agents, 300 roles, 300 frameworks, with drf sorter
> Made 350 allocations in 168.745307ms
> Made 0 allocation in 95.505069ms
> 
> Added 3000 agents in 38.074525ms
> Added 3000 frameworks in 11.249150205secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with drf sorter
> Made 3500 allocations in 12.772526049secs
> Made 0 allocation in 10.132801781secs
> 
> Added 30 agents in 799844ns
> Added 30 frameworks in 5.8663ms
> Benchmark setup: 30 agents, 30 roles, 30 frameworks, with random sorter
> Made 38 allocations in 9.612524ms
> Made 0 allocation in 5.150924ms
> 
> Added 300 agents in 5.560583ms
> Added 300 frameworks in 138.469712ms
> Benchmark setup: 300 agents, 300 roles, 300 frameworks, with random sorter
> Made 391 allocations in 175.021255ms
> Made 0 allocation in 138.181869ms
> 
> Added 3000 agents in 42.921689ms
> Added 3000 frameworks in 10.825018278secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 15.29232742secs
> Made 0 allocation in 14.202057473secs
> 
> 
> Thanks,
> 
> Andrei Sekretenko
> 
>

