mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chun-Hung Hsiao <chhs...@apache.org>
Subject Re: Review Request 66050: Implemented grow and shrink of persistent volumes.
Date Tue, 27 Mar 2018 00:37:09 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66050/#review199985
-----------------------------------------------------------




src/common/resources_utils.cpp
Lines 202 (patched)
<https://reviews.apache.org/r/66050/#comment280495>

    Let's do the following check to make it explicit that this function does not support RP
resource operations:
    ```
    if (TResources::hasResourceProvider(volume)) {
      return Error("Operation not supported for resource provider");
    }
    ```



src/common/resources_utils.cpp
Lines 203-225 (patched)
<https://reviews.apache.org/r/66050/#comment280494>

    I don't think this is correct. The consumed resources should remain the same so the master
can apply the conversion. Unless I'm missing something important, it should be as simple as:
    ```
    TResource converted = volume;
    *converted.multable_scalar() += addition.scalar(); 
    
    conversions.emplace_back(volume + addition, converted);
    ```



src/common/resources_utils.cpp
Lines 231 (patched)
<https://reviews.apache.org/r/66050/#comment280496>

    Let's do the following check to make it explicit that this function does not support RP
resource operations:
    ```
    if (TResources::hasResourceProvider(volume)) {
      return Error("Operation not supported for resource provider");
    }
    ```



src/common/resources_utils.cpp
Lines 238-240 (patched)
<https://reviews.apache.org/r/66050/#comment280502>

    How about `s/stripped/freed/`? This would make the intention of the variable more clear,
but create a slight inconsistency w.r.t. other operations, so it's up to you.
    
    Also, this can be slightly simplified:
    ```
    freed.mutable_scalar()->CopyFrom(
        operation.shrink_volume().substract());
    ```



src/common/resources_utils.cpp
Lines 254-259 (patched)
<https://reviews.apache.org/r/66050/#comment280501>

    `s/shrinkedVolume/shrinked`
    
    Since `target` is only used once here, I'd suggest that we do the subtraction in place:
    ```
    TResource shrinked = volume;
    *shrinked.mutable_scalar() -= operation.shrink_volume().subtract();
    
    conversions.emplace_back(volume, shrinked + freed);
    ```
    
    Alternatively, we can leverage `TResources::shrink` to have more safety guarantee:
    ```
    TResource shrinked = volume;
    CHECK(TResources::shrink(&shrinked, operation.shrink_volume().subtract());
    
    conversions.emplace_back(volume, shrinked + freed);
    ```
    
    Both options are good to me since we do validations elsewhere.



src/master/validation.cpp
Lines 2460-2513 (patched)
<https://reviews.apache.org/r/66050/#comment280503>

    Let's reorganize the validation process as follows:
    
    First, general validations:
    - `resource::validate(shrink.volume())`
    - `shrink.subtract() > zero`, where `zero` is a zero-valued scalar.
    - `shrink.subtract() < shrink.volume().scalar()`
    Note that I'm levegring the comparisons defined in `include/mesos/values.hpp`, which do
fixed-point conversions before comparing scalars. Consider adding `operator<` and `operator>`.
    
    Second, reject RP resources:
    ```
    if (Resources::hasResourceProvider(shrink.volume())) {
      return Error("Shrinking a volume from a resource provider is not supported");
    }
    ```
    
    Third, validation specific to agent default resources:
    - `resource::validatePersistentVolume(shrink.volume())`
    - `!Resources::isShared(shrink.volume())`
    
    As we discussed earlier, the resource containment check needs not to be done here. Given
this, I think we can remove most of the parameters in this function.



src/slave/slave.cpp
Lines 4292-4324 (original), 4292-4342 (patched)
<https://reviews.apache.org/r/66050/#comment280526>

    Let's do some refactoring here. There are 3 cases:
    
    1. For speculative RP operations, the agent needs to apply them so it can proceed, but
the status update is delegated to RP themselves so no need to generate state updates here.
    2. For non-speculative RP operations, the agent can completely delegate the handling to
RP.
    3. For non-RP operations, the agent needs to apply, checkpoint, and update the statuses.
    
    One option I can think of for refactoring is:
    1. Make `apply` handle `GROW_VOLUME` and `SHRINK_VOLUME` and return an `OperationStatus`.
For speculative operation, do the actual resource change and return an update with an empty
conversion; for non-speculative operation, return one with the actual conversion.
    2. Do the following here:
    ```
    if (resourceProviderId.isSome()) {
      if (protobuf::isSpeculativeOperation(operation.info())) {
        apply(operation);
      }
      return resourceProviderManager.applyOperation(message);
    }
    
    const OperationStatus status = apply(operation);
    
    ... // The checkpointing logic.
    
    UpdateOperationStatusMessage update =
      protobuf::createUpdateOperationStatusMessage(... status ...);
      
    updateOperation(operation, update);
    ```
    Other thoughts on refactoring/improving the readability are also welcome.
    
    Also, the current agent code doesn't checkpoint operations, which might cause some reconciliation
problem. But I think we can fix this later, just need to think through the potential problems
first.



src/slave/slave.cpp
Lines 4328 (patched)
<https://reviews.apache.org/r/66050/#comment280509>

    I think `message` is only used for passing error messages, so we should use `None()` here.



src/slave/slave.cpp
Lines 4331 (patched)
<https://reviews.apache.org/r/66050/#comment280507>

    This should be removed. The operation is already added above in line 4920.



src/slave/slave.cpp
Lines 8053-8059 (patched)
<https://reviews.apache.org/r/66050/#comment280595>

    If we checkpoint resources after sending status updates, then say if the agent crashes
before checkpointing, the master would have assumed that the operation is completed but the
agent will recover to its old states. So let's checkpoint resources before sending status
updates, as I suggested above.



src/tests/mesos.hpp
Lines 2991-2994 (patched)
<https://reviews.apache.org/r/66050/#comment280506>

    Add a TODO for implementing a default behavior for the mock RP.



src/tests/persistent_volume_tests.cpp
Lines 208 (patched)
<https://reviews.apache.org/r/66050/#comment280504>

    `s/These two will not unreachable/Temporarily disable these two operations for now./`



src/tests/reservation_tests.cpp
Lines 126 (patched)
<https://reviews.apache.org/r/66050/#comment280505>

    Remove this blank line, or add a TODO if you plan to enable the two operations later in
the chain.


- Chun-Hung Hsiao


On March 26, 2018, 4:53 p.m., Zhitao Li wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66050/
> -----------------------------------------------------------
> 
> (Updated March 26, 2018, 4:53 p.m.)
> 
> 
> Review request for mesos, Chun-Hung Hsiao, Gaston Kleiman, and Greg Mann.
> 
> 
> Bugs: MESOS-4965
>     https://issues.apache.org/jira/browse/MESOS-4965
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> The new offer operations are implemented as non-speculative operations,
> which means they cannot be chained with previous offer operations which
> depend on results of each other.
> 
> 
> Diffs
> -----
> 
>   src/common/protobuf_utils.cpp 2acdc798200e82d14651b1edcc5f83e174477d53 
>   src/common/resources_utils.cpp 9be01c1abd48264e308960f35cc7c2ee8a367518 
>   src/master/master.cpp 390fbe9fe71981101ed2bd9a1f1b21b0e3727b7b 
>   src/master/validation.hpp 7c129ceb929596acbb64d37025e055661277e6bf 
>   src/master/validation.cpp 9254cd22e67d9aca53135701575fb0857ff6f7d6 
>   src/slave/slave.cpp e76daabad0d2d68aa42d1da809d4a23459eaaacb 
>   src/tests/mesos.hpp 2c3d0c9350bbbbcd2223ff20c0797d1849d38c19 
>   src/tests/persistent_volume_tests.cpp 924d8458e54e34a49c99593482b5908c5f7c7a48 
>   src/tests/reservation_tests.cpp 8d8e9c8390e65187269bd194bb322bbdff88e0bd 
> 
> 
> Diff: https://reviews.apache.org/r/66050/diff/5/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Zhitao Li
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message