hadoop-yarn-dev mailing list archives

From Zhihai Xu <zhihaixu2...@gmail.com>
Subject Re: NodeManagers Localization does not work
Date Tue, 12 Jan 2016 16:31:42 GMT
Hi Prabhu,

I have seen a similar localization timeout issue. In my case the cause was
HDFS, not YARN, and HDFS-7005 <https://issues.apache.org/jira/browse/HDFS-7005>
fixed it. HDFS-7005 is only in release 2.6 and later.
The root cause was that all public localizer threads were stuck reading file
data from HDFS.
You could try HDFS-7005 to see whether it fixes your issue.
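
To illustrate why stuck reads make later containers hang, here is a rough
simulation. It is not taken from the Hadoop source: the pool size, the
function names, and the use of a never-set Event to model a read that never
returns are all assumptions made for the sketch. It only shows the general
effect: once every worker in a fixed-size pool is blocked on a read with no
timeout, any request queued behind them cannot start.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

# Hypothetical pool size; stands in for a fixed set of public localizer threads.
POOL_SIZE = 4


def fetch(stalled, read_timeout=None):
    """Simulate one worker reading a resource.

    `stalled` is an Event that models a read returning no data. With
    read_timeout=None the call blocks until the Event is set; with a
    timeout it gives up and returns False once the timeout expires.
    """
    return stalled.wait(timeout=read_timeout)


def queued_request_runs(read_timeout, wait=0.5):
    """Return True if a request queued behind stalled reads gets to run."""
    never = threading.Event()
    pool = ThreadPoolExecutor(max_workers=POOL_SIZE)
    try:
        # Occupy every worker with a stalled read.
        for _ in range(POOL_SIZE):
            pool.submit(fetch, never, read_timeout)
        # A new request now has to wait for a free worker.
        later = pool.submit(lambda: "localized")
        try:
            return later.result(timeout=wait) == "localized"
        except FutureTimeout:
            return False
    finally:
        never.set()  # release the stalled workers so the demo can exit
        pool.shutdown(wait=True)


if __name__ == "__main__":
    print("no read timeout  :", queued_request_runs(read_timeout=None))
    print("with read timeout:", queued_request_runs(read_timeout=0.1))
```

Without a read timeout the queued request never gets a thread within the wait
window; with one, the stalled workers give up and the queued request runs.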

Regards
zhihai

On Tue, Jan 12, 2016 at 2:41 AM, Prabhu Joseph <prabhujose.gates@gmail.com>
wrote:

> Hi Experts,
>
>    On hadoop-2.5.1, when localization fails for a container of a job in
> a NodeManager at
>
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.startLocalizer,
> the subsequent containers of that job submitted to that NodeManager
> hang in the Localizing state until the task times out.
>
> On hadoop-2.7.0, the above behavior is fixed, by creating another Localizer
> for the job in the NodeManager when the previous container fails at
> Localization.
>
> Can someone point me to the YARN JIRA that fixed the above issue in
> hadoop-2.7.0?
>
>
> Thanks,
> Prabhu Joseph
>
