hadoop-yarn-dev mailing list archives

From "Chandni Singh (JIRA)" <j...@apache.org>
Subject [jira] [Created] (YARN-8360) Yarn service conflict between restart policy and NM configuration
Date Thu, 24 May 2018 20:59:00 GMT
Chandni Singh created YARN-8360:

             Summary: Yarn service conflict between restart policy and NM configuration 
                 Key: YARN-8360
                 URL: https://issues.apache.org/jira/browse/YARN-8360
             Project: Hadoop YARN
          Issue Type: Bug
          Components: yarn
            Reporter: Chandni Singh

With the spec below, the service does not stop after container failures because of the
NM auto-retry properties:
 * "yarn.service.container-failure.retry.max": 1,
 * "yarn.service.container-failure.validity-interval-ms": 5000

The NM keeps auto-restarting the container. {{fail_after 20}} fails after 20 seconds.
Because the failure validity interval is only 5 seconds, each failure falls outside the
validity window of the previous one, so the retry limit is never reached and the NM
restarts the container every time.

{
  "name": "fail-demo2",
  "version": "1.0.0",
  "components" : [
    {
      "name": "comp1",
      "number_of_containers": 1,
      "launch_command": "fail_after 20",
      "restart_policy": "NEVER",
      "resource": {
        "cpus": 1,
        "memory": "256"
      },
      "configuration": {
        "properties": {
          "yarn.service.container-failure.retry.max": 1,
          "yarn.service.container-failure.validity-interval-ms": 5000
        }
      }
    }
  ]
}

If {{restart_policy}} is NEVER, then the service should stop after the container fails.
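
To illustrate why the retry limit is never hit with these values, here is a rough
sketch of a sliding-window failure counter. It is a simplified model written for this
example, not the NM's actual implementation: the function name, its parameters, and the
rule "stop once more than retry_max failures fall inside the window" are all assumptions.

```python
def count_restarts(fail_after_s, retry_max, validity_interval_ms, horizon_s):
    """Simplified, hypothetical model of NM auto-retry with a validity interval.

    Returns (restarts, still_restarting): how many times the container was
    restarted within horizon_s, and whether it would still be restarted.
    """
    window_s = validity_interval_ms / 1000.0
    recent_failures = []  # timestamps of failures still inside the window
    restarts = 0
    now = 0.0
    while True:
        now += fail_after_s  # the container fails fail_after_s after each start
        if now > horizon_s:
            return restarts, True  # retries were never exhausted in the horizon
        # Forget failures that happened outside the validity window.
        recent_failures = [t for t in recent_failures if t > now - window_s]
        recent_failures.append(now)
        if len(recent_failures) > retry_max:
            return restarts, False  # retry limit reached, no more restarts
        restarts += 1


# Values from the spec above: failure every 20 s, max 1 retry, 5 s window.
print(count_restarts(20, 1, 5000, 120))
# Same spec with a 60 s window, for comparison.
print(count_restarts(20, 1, 60000, 120))
```

Under this model the first call never exhausts its retries, because consecutive
failures are 20 seconds apart and the window is 5 seconds. With a window longer than
the failure spacing, the second failure would reach the limit.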

Since we have introduced service-level restart policies, I think we should make the NM
auto-retry configurations part of the {{RetryPolicy}} and get rid of all {{yarn.service.container-failure.**}}
properties. Otherwise it gets confusing.

