ambari-dev mailing list archives

From "Jonathan Hurley" <jhur...@hortonworks.com>
Subject Re: Review Request 40139: SKIPPED_FAILED state should not be bubbled up to the Upgrade level
Date Tue, 10 Nov 2015 19:31:40 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/40139/#review105932
-----------------------------------------------------------



ambari-server/src/main/java/org/apache/ambari/server/controller/internal/CalculatedStatus.java
(lines 352 - 361)
<https://reviews.apache.org/r/40139/#comment164631>

    So my main issue here is that someone not completely familiar with this issue could easily use the other method, which would produce a status of SKIPPED_FAILED.
    
    It's just too easy to pick the wrong method here when writing dependent code.
    
    With that said, why are we even bubbling this up? This should behave as any other failure:
    
    counters.get(HostRoleStatus.FAILED) > 0 && !skippable ? HostRoleStatus.FAILED :
    
    - If the command fails and it wasn't skippable, then the stage and request are FAILED
    - If the command fails and it was skippable, then the stage and request are COMPLETED but the command itself still failed
    
    Just because it was "auto skipped" doesn't mean it should display differently.
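
    The rule proposed above can be sketched roughly as follows. This is only an illustration, not the actual CalculatedStatus code: the class name, the `rollUp` method, and the reduced `Status` enum are hypothetical stand-ins, and the handling of unfinished commands is deliberately simplified.

```java
import java.util.EnumMap;
import java.util.Map;

public class StatusRollupSketch {

  // Hypothetical, reduced stand-in for HostRoleStatus (assumption: only the
  // values needed for this illustration are listed).
  public enum Status { PENDING, IN_PROGRESS, COMPLETED, FAILED }

  /**
   * Rolls per-command status counts up into a single stage/request status.
   * A failure only surfaces as FAILED when the command was not skippable;
   * no separate SKIPPED_FAILED value is ever returned at this level.
   */
  public static Status rollUp(Map<Status, Integer> counters, boolean skippable) {
    int failed = counters.getOrDefault(Status.FAILED, 0);
    if (failed > 0 && !skippable) {
      return Status.FAILED;
    }
    // Simplification (assumption): any unfinished command keeps the
    // roll-up IN_PROGRESS.
    int unfinished = counters.getOrDefault(Status.PENDING, 0)
        + counters.getOrDefault(Status.IN_PROGRESS, 0);
    if (unfinished > 0) {
      return Status.IN_PROGRESS;
    }
    // Skippable failures do not change the roll-up; the command itself
    // still records its own failure.
    return Status.COMPLETED;
  }

  public static void main(String[] args) {
    Map<Status, Integer> counters = new EnumMap<>(Status.class);
    counters.put(Status.COMPLETED, 3);
    counters.put(Status.FAILED, 1);

    System.out.println(rollUp(counters, false));
    System.out.println(rollUp(counters, true));
  }
}
```

    With one failed command out of four, the non-skippable case rolls up to FAILED and the skippable case to COMPLETED, so dependent code never has to choose between two methods.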


- Jonathan Hurley


On Nov. 10, 2015, 12:55 p.m., Dmitro Lisnichenko wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/40139/
> -----------------------------------------------------------
> 
> (Updated Nov. 10, 2015, 12:55 p.m.)
> 
> 
> Review request for Ambari, Alejandro Fernandez, Jonathan Hurley, Jayush Luniya, Nate Cole, and Yusaku Sako.
> 
> 
> Bugs: AMBARI-13818
>     https://issues.apache.org/jira/browse/AMBARI-13818
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> When there is a skipped failure, the "upgrade" state itself becomes SKIPPED_FAILED. Even when the upgrade is running or paused, it returns "SKIPPED_FAILED". The API should not roll this up to the "upgrade" level, as the current behavior is confusing. At the top level, it should just be HOLDING, IN_PROGRESS, COMPLETED, etc. SKIPPED_FAILED should be bubbled up to the upgrade group level and stop there.
> 
> 
> Also fixes another blocker:
> STR:
> 1) Install and deploy cluster with older HDP version
> 2) Enable NameNode HA
> 3) Register, install new HDP version
> 4) Start Rolling Upgrade with "Skip all Service Check failures" and "Skip all Slave Component failures" options
> 5) Break datanode_upgrade.py script and wait for Core Slaves failures
> 6) Click "Pause upgrade" on "Core Slaves -> Verifying Skipped Failures" step
> Result:
> The "Resume upgrade" button doesn't work. After clicking it, I got the following HTTP response:
> {
>   "status" : 400,
>   "message" : "java.lang.IllegalArgumentException: Can only set status to PENDING when the upgrade is ABORTED (currently SKIPPED_FAILED)"
> }
> 
> 
> Diffs
> -----
> 
>   ambari-server/src/main/java/org/apache/ambari/server/controller/internal/CalculatedStatus.java f87c32c 
>   ambari-server/src/test/java/org/apache/ambari/server/controller/internal/CalculatedStatusTest.java 4b8587f 
> 
> Diff: https://reviews.apache.org/r/40139/diff/
> 
> 
> Testing
> -------
> 
> checked on live cluster
> 
> mvn clean test in progress
> 
> 
> Thanks,
> 
> Dmitro Lisnichenko
> 
>

