ambari-dev mailing list archives

From "Alejandro Fernandez" <afernan...@hortonworks.com>
Subject Re: Review Request 29298: RU: Cannot Retry on failure
Date Mon, 22 Dec 2014 17:17:47 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29298/#review65793
-----------------------------------------------------------



ambari-server/src/main/java/org/apache/ambari/server/actionmanager/HostRoleCommand.java
<https://reviews.apache.org/r/29298/#comment109039>

    There are a lot of places that call the constructor. Can it be overloaded with a default value of false for retryAllowed?
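
    A minimal sketch of the overload suggested above. HostRoleCommandSketch and its two fields are hypothetical stand-ins; the real HostRoleCommand constructor signature is not shown in this thread, so only the delegation pattern is meant to carry over.

```java
// Hypothetical, simplified stand-in for HostRoleCommand. The field names and
// parameter list are assumptions made for illustration only.
final class HostRoleCommandSketch {
    private final String hostName;
    private final String role;
    private final boolean retryAllowed;

    // Existing-style constructor: current call sites keep compiling unchanged
    // and implicitly get retryAllowed = false.
    HostRoleCommandSketch(String hostName, String role) {
        this(hostName, role, false);
    }

    // New overload for callers that want to opt in to retry.
    HostRoleCommandSketch(String hostName, String role, boolean retryAllowed) {
        this.hostName = hostName;
        this.role = role;
        this.retryAllowed = retryAllowed;
    }

    String getHostName() {
        return hostName;
    }

    String getRole() {
        return role;
    }

    boolean isRetryAllowed() {
        return retryAllowed;
    }
}
```

    With this shape, only the call sites that need retry have to change; every other caller keeps the two-argument form.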


- Alejandro Fernandez


On Dec. 22, 2014, 5:06 a.m., Tom Beerbower wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29298/
> -----------------------------------------------------------
> 
> (Updated Dec. 22, 2014, 5:06 a.m.)
> 
> 
> Review request for Ambari, Alejandro Fernandez and Nate Cole.
> 
> 
> Bugs: AMBARI-8852
>     https://issues.apache.org/jira/browse/AMBARI-8852
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> During RU, a failure occurred on "Client Components" group, "Service Check HBASE, MAPREDUCE2, HDFS, YARN" item.
> The UI presented me with a Retry button.  However, the server rejected this request:
> 
> PUT /api/v1/clusters/ysru2/upgrades/5/upgrade_groups/4/upgrade_items/30
> {"UpgradeItem":{"status":"PENDING"}}
> 
> {
>   "status" : 400,
>   "message" : "java.lang.IllegalArgumentException: Can not transition a stage from FAILED to PENDING"
> }
> 
> I believe this is the current expected behavior, since the failure is not marked to hold.
> However, on any service check failure, the user should be able to retry (or maybe on any failure, since actions should be idempotent?).
> 
> ----
> 
> Allow Retry - mark a stage (upgrade item) to allow any failed task to be retried. This means that if a failure occurs during the execution of the task, then the stage and task will transition to HOLDING_FAILED. Once in the HOLDING_FAILED state, the stage can be pushed to PENDING (retry) or FAILED. Transitioning the stage to FAILED will cause the remaining tasks in that stage to be ABORTED. It never makes sense to allow the remaining tasks of a stage to continue executing after the stage has been accepted as FAILED. However, the remaining stages of the upgrade request may be allowed to execute...
> 
> Skippable - mark a stage to allow it to be skipped in the event of a failure so that the remaining stages may still execute. This means that when a stage state is set to FAILED, it will not trigger the remaining stages of the request to abort.
> By separating the concepts of retry and skippable, we can be more flexible in how we define the behavior of the upgrade. For example, the core masters upgrade item should be marked as allow_retry = true and skippable = false. If a failure occurs during this stage, you should be able to retry. If the failure cannot be resolved, then the entire upgrade request should be aborted.
> 
> 
> Diffs
> -----
> 
>   ambari-server/src/main/java/org/apache/ambari/server/actionmanager/ActionScheduler.java ccecad9
>   ambari-server/src/main/java/org/apache/ambari/server/actionmanager/HostRoleCommand.java f71e2d5
>   ambari-server/src/main/java/org/apache/ambari/server/actionmanager/Stage.java 4922fa5
>   ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariActionExecutionHelper.java 17d5782
>   ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariCustomCommandExecutionHelper.java c8ae61d
>   ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariManagementControllerImpl.java 19ee6d9
>   ambari-server/src/main/java/org/apache/ambari/server/controller/KerberosHelper.java fb19bd5
>   ambari-server/src/main/java/org/apache/ambari/server/controller/internal/ClusterStackVersionResourceProvider.java 9329ea9
>   ambari-server/src/main/java/org/apache/ambari/server/controller/internal/HostStackVersionResourceProvider.java 3b1b462
>   ambari-server/src/main/java/org/apache/ambari/server/controller/internal/UpgradeResourceProvider.java fa39c97
>   ambari-server/src/main/java/org/apache/ambari/server/utils/StageUtils.java e6e51a1
>   ambari-server/src/test/java/org/apache/ambari/server/actionmanager/ExecutionCommandWrapperTest.java 948f137
>   ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionDBAccessorImpl.java a756275
>   ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionManager.java 01a40f4
>   ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java 8ce4ff2
>   ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestStage.java bde19a1
>   ambari-server/src/test/java/org/apache/ambari/server/agent/TestHeartbeatHandler.java a6df0db
>   ambari-server/src/test/java/org/apache/ambari/server/controller/AmbariManagementControllerTest.java 72a22e6
>   ambari-server/src/test/java/org/apache/ambari/server/serveraction/ServerActionExecutorTest.java 4bd0d18
>   ambari-server/src/test/java/org/apache/ambari/server/stageplanner/TestStagePlanner.java dd2a519
> 
> Diff: https://reviews.apache.org/r/29298/diff/
> 
> 
> Testing
> -------
> 
> Results :
> 
> Tests run: 2447, Failures: 0, Errors: 0, Skipped: 13
> 
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 27:49 min
> [INFO] Finished at: 2014-12-21T23:52:28-05:00
> [INFO] Final Memory: 42M/496M
> [INFO] ------------------------------------------------------------------------
> 
> 
> Thanks,
> 
> Tom Beerbower
> 
>
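
The Allow Retry and Skippable semantics in the quoted description can be sketched as a small state machine. This is only an illustration under stated assumptions: StageSketch, SketchStatus, and the method names are hypothetical and are not taken from the Ambari code base; only the state names and the transition rules come from the description above.

```java
// Hypothetical status set; only the states named in the description are modeled.
enum SketchStatus { PENDING, HOLDING_FAILED, FAILED }

// Hypothetical model of a stage (upgrade item) with the two independent flags.
final class StageSketch {
    private final boolean allowRetry;
    private final boolean skippable;
    private SketchStatus status = SketchStatus.PENDING;

    StageSketch(boolean allowRetry, boolean skippable) {
        this.allowRetry = allowRetry;
        this.skippable = skippable;
    }

    SketchStatus status() {
        return status;
    }

    // A task failure holds the stage when retry is allowed; otherwise it fails outright.
    void onTaskFailure() {
        status = allowRetry ? SketchStatus.HOLDING_FAILED : SketchStatus.FAILED;
    }

    // A user request may only move a held stage to PENDING (retry) or FAILED (accept).
    void userTransition(SketchStatus target) {
        boolean legal = status == SketchStatus.HOLDING_FAILED
                && (target == SketchStatus.PENDING || target == SketchStatus.FAILED);
        if (!legal) {
            throw new IllegalArgumentException(
                    "Can not transition a stage from " + status + " to " + target);
        }
        status = target;
    }

    // Once the stage is FAILED, the remaining stages abort unless it is skippable.
    boolean abortsRemainingStages() {
        return status == SketchStatus.FAILED && !skippable;
    }
}
```

Under this sketch, the core masters item from the description would be new StageSketch(true, false): a failure holds the stage, a retry pushes it back to PENDING, and accepting the failure aborts the rest of the request.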

