ambari-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jonathan Hurley" <jhur...@hortonworks.com>
Subject Re: Review Request 40139: SKIPPED_FAILED state should not be bubbled up to the Upgrade level
Date Tue, 10 Nov 2015 19:33:39 GMT


> On Nov. 10, 2015, 2:31 p.m., Jonathan Hurley wrote:
> > ambari-server/src/main/java/org/apache/ambari/server/controller/internal/CalculatedStatus.java,
lines 352-361
> > <https://reviews.apache.org/r/40139/diff/2/?file=1122084#file1122084line352>
> >
> >     So my main issue here is that for someone not completely familiar with this
issue, they could easily use the other method which would produce a status of SKIPPED_FAILED.
> >     
> >     It's just too easy to pick the wrong method here when writing dependent code.
> >     
> >     With that said, why are we even bubbling this up? This should behave as any
other failure:
> >     
> >     counters.get(HostRoleStatus.FAILED) > 0 && !skippable ? HostRoleStatus.FAILED
:
> >     
> >     - If the command fails and it wasn't skippable, then the stage and request are
FAILED
> >     - If the command fails and it was skippable, then the stage and request are
COMPLETED but the command itself still failed
> >     
> >     Just because it was "auto skipped" doesn't mean it should display differently.

To clarify; auto skipped should leave the command as SKIPPED_FAILED, but the stage and request
should not be - just like how it is for a failure which was skipped by hand.


- Jonathan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/40139/#review105932
-----------------------------------------------------------


On Nov. 10, 2015, 12:55 p.m., Dmitro Lisnichenko wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/40139/
> -----------------------------------------------------------
> 
> (Updated Nov. 10, 2015, 12:55 p.m.)
> 
> 
> Review request for Ambari, Alejandro Fernandez, Jonathan Hurley, Jayush Luniya, Nate
Cole, and Yusaku Sako.
> 
> 
> Bugs: AMBARI-13818
>     https://issues.apache.org/jira/browse/AMBARI-13818
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> When there is a skipped failure, the "upgrade" state itself comes SKIPPED_FAILED. Even
when the upgrade is running or paused, it is returning "SKIPPED_FAILED". The API should not
roll this up to the "upgrade" level as the current behavior is confusing. At the top level,
it should just be HOLDING, IN_PROGRESS, COMPLETED, etc. SKIPPED_FAILED should be bubbled up
to the upgrade group level and stop there.
> 
> 
> Also fixes another blocker:
> STR:
> 1) Install and deploy cluster with older HDP version
> 2) Enable NameNode HA
> 3) Register, install new HDP version
> 4) Start Rolling Upgrade with "Skip all Service Check failures" and "Skip all Slave Component
failures" options
> 5) Break datanode_upgrade.py script and wait for Core Slaves failures
> 6) Click "Pause upgrade" on "Core Slaves - >Verifying Skipped Failures" step
> Result:
> Button "Resume upgrade" doesn't work. After clicking on this button I've got next http
response
> {
>   "status" : 400,
>   "message" : "java.lang.IllegalArgumentException: Can only set status to PENDING when
the upgrade is ABORTED (currently SKIPPED_FAILED)"
> }
> 
> 
> Diffs
> -----
> 
>   ambari-server/src/main/java/org/apache/ambari/server/controller/internal/CalculatedStatus.java
f87c32c 
>   ambari-server/src/test/java/org/apache/ambari/server/controller/internal/CalculatedStatusTest.java
4b8587f 
> 
> Diff: https://reviews.apache.org/r/40139/diff/
> 
> 
> Testing
> -------
> 
> checked on live cluster
> 
> mvn clean test in progress
> 
> 
> Thanks,
> 
> Dmitro Lisnichenko
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message