hadoop-yarn-dev mailing list archives

From "liyakun (JIRA)" <j...@apache.org>
Subject [jira] [Created] (YARN-9480) createAppDir() in LogAggregationService shouldn't block dispatcher thread of ContainerManagerImpl
Date Sun, 14 Apr 2019 07:59:00 GMT
liyakun created YARN-9480:
-----------------------------

             Summary: createAppDir() in LogAggregationService shouldn't block dispatcher thread of ContainerManagerImpl
                 Key: YARN-9480
                 URL: https://issues.apache.org/jira/browse/YARN-9480
             Project: Hadoop YARN
          Issue Type: Improvement
          Components: nodemanager
            Reporter: liyakun
            Assignee: liyakun


At present, when startContainers() is called and the NM does not yet contain the application,
the NM enters the INIT_APPLICATION step. During application init, createAppDir() is executed,
and it is a blocking operation.

createAppDir() needs to interact with an external file system, so its latency depends on the
SLA of that file system. When the external file system has high latency, the NM dispatcher
thread of ContainerManagerImpl gets stuck. (In fact, I have seen a case where the NM was
stuck here for more than an hour.)

I think it would be more reasonable to defer createAppDir() until the logs are actually
uploaded (in other threads). In addition, depending on the logRetentionPolicy, many containers
may never reach this step, which would save a lot of interactions with the external file system.
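
To make the proposal concrete, here is a minimal sketch of the deferred approach. It is not
the actual NodeManager code: RemoteFs, AppLogAggregator, initApp(), uploadLogs() and the
application id are hypothetical names used only for illustration, and the retention decision
is reduced to a boolean. The point is only that application init does no remote I/O, and the
directory is created at most once, on the upload thread, when there is something to upload.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class LazyAppDirSketch {

    /** Hypothetical stand-in for the external file system; not a Hadoop API. */
    interface RemoteFs {
        void createAppDir(String appId) throws Exception;
    }

    /** Hypothetical per-application aggregator that creates the directory lazily. */
    static final class AppLogAggregator {
        private final String appId;
        private final RemoteFs fs;
        private boolean dirCreated; // guarded by "this"

        AppLogAggregator(String appId, RemoteFs fs) {
            this.appId = appId;
            this.fs = fs;
        }

        /** Called on the dispatcher thread: only local bookkeeping, no remote call. */
        void initApp() {
            // intentionally empty in this sketch
        }

        /** Called on an upload thread: the remote directory is created on first real upload. */
        synchronized void uploadLogs(boolean shouldUpload) throws Exception {
            if (!shouldUpload) {
                return; // retention policy says nothing to upload, so no remote call at all
            }
            if (!dirCreated) {
                fs.createAppDir(appId); // may be slow, but the dispatcher is not waiting on it
                dirCreated = true;
            }
            // ... the actual log upload would go here ...
        }
    }

    /** Runs a few upload rounds and returns how often createAppDir() was invoked. */
    static int countCreateCalls(boolean shouldUpload, int uploadRounds) {
        AtomicInteger calls = new AtomicInteger();
        RemoteFs fs = appId -> calls.incrementAndGet();
        AppLogAggregator aggregator = new AppLogAggregator("application_0001", fs);

        aggregator.initApp(); // "dispatcher" side: returns immediately

        ExecutorService uploader = Executors.newSingleThreadExecutor();
        try {
            for (int i = 0; i < uploadRounds; i++) {
                uploader.submit(() -> {
                    aggregator.uploadLogs(shouldUpload);
                    return null;
                }).get();
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            uploader.shutdown();
        }
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println(countCreateCalls(false, 3)); // prints 0
        System.out.println(countCreateCalls(true, 3));  // prints 1
    }
}
```

With this shape, a slow external file system can delay only the upload thread of the affected
application, and applications whose logs are never uploaded never touch the remote file system.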



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

