hadoop-yarn-dev mailing list archives

From "Sushanta Sen (Jira)" <j...@apache.org>
Subject [jira] [Created] (YARN-10670) YARN: Opportunistic Container : : In distributed shell job if containers are killed then application is failed. But in this case as containers are killed to make room for guaranteed containers which is not correct to fail an application
Date Thu, 04 Mar 2021 14:15:00 GMT
Sushanta Sen created YARN-10670:
-----------------------------------

             Summary: YARN: Opportunistic Container : : In distributed shell job if containers are killed then application is failed. But in this case as containers are killed to make room for guaranteed containers which is not correct to fail an application
                 Key: YARN-10670
                 URL: https://issues.apache.org/jira/browse/YARN-10670
             Project: Hadoop YARN
          Issue Type: Bug
          Components: distributed-shell
    Affects Versions: 3.1.1
            Reporter: Sushanta Sen


Preconditions:
 # A secure Hadoop 3.1.1 cluster with 3 nodes is installed.
 # Set the following property in the RM:
 <property>
   <name>yarn.resourcemanager.opportunistic-container-allocation.enabled</name>
   <value>true</value>
 </property>
 # Set the following property in the NM(s):
 <property>
   <name>yarn.nodemanager.opportunistic-containers-max-queue-length</name>
   <value>30</value>
 </property>

 
Test Steps:


Job Command: yarn org.apache.hadoop.yarn.applications.distributedshell.Client -jar HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1-hw-ei-310001-SNAPSHOT.jar -shell_command sleep -shell_args 20 -num_containers 20 -container_type OPPORTUNISTIC

Actual Result: The distributed shell YARN job failed with the following diagnostics message:

{noformat}
Attempt recovered after RM restartApplication Failure: desired = 20, completed = 20, allocated = 20, failed = 1, diagnostics = [2021-02-09 22:11:48.440]Container De-queued to meet NM queuing limits.
[2021-02-09 22:11:48.441]Container terminated before launch.
{noformat}
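
The accounting behind "failed = 1" can be illustrated with a minimal, self-contained Java sketch. This is not the distributed shell ApplicationMaster code and not a proposed patch. The class CompletionAccounting and its methods are hypothetical. The numeric exit statuses are copied from my reading of org.apache.hadoop.yarn.api.records.ContainerExitStatus, and it is an assumption that the NM reports KILLED_BY_CONTAINER_SCHEDULER for a container de-queued to meet queuing limits. The idea is that a container killed by the container scheduler is re-requested instead of being counted as failed.

```java
import java.util.List;

public class CompletionAccounting {
    // Assumed to mirror ContainerExitStatus in Hadoop 3.x; verify against the
    // Hadoop version in use before relying on these values.
    static final int SUCCESS = 0;
    static final int ABORTED = -100;
    // Assumption: this is the status reported when an opportunistic container
    // is de-queued or killed by the NM container scheduler.
    static final int KILLED_BY_CONTAINER_SCHEDULER = -108;

    enum Outcome { SUCCEEDED, FAILED, RETRY }

    // Hypothetical classification: scheduler kills are treated like aborted
    // containers (ask again) rather than as application-level failures.
    static Outcome classify(int exitStatus) {
        switch (exitStatus) {
            case SUCCESS:
                return Outcome.SUCCEEDED;
            case ABORTED:
            case KILLED_BY_CONTAINER_SCHEDULER:
                return Outcome.RETRY;
            default:
                return Outcome.FAILED;
        }
    }

    // Returns counts as {succeeded, failed, retry}.
    static int[] tally(List<Integer> exitStatuses) {
        int[] counts = new int[3];
        for (int status : exitStatuses) {
            counts[classify(status).ordinal()]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // 19 clean exits plus one container killed by the container scheduler,
        // loosely modelled on the 20-container job in this report.
        List<Integer> statuses = new java.util.ArrayList<>();
        for (int i = 0; i < 19; i++) {
            statuses.add(SUCCESS);
        }
        statuses.add(KILLED_BY_CONTAINER_SCHEDULER);

        int[] counts = tally(statuses);
        System.out.println("succeeded=" + counts[0]
                + " failed=" + counts[1]
                + " retry=" + counts[2]);
    }
}
```

Under this classification the de-queued container lands in the retry bucket, so the failure count stays at zero and the application would request a replacement container. Whether retrying is the right behaviour for every scheduler kill is a design choice left open here.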

 




