tez-dev mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Created] (TEZ-3191) NM container diagnostics for excess resource usage can be lost if task fails while being killed
Date Wed, 30 Mar 2016 16:43:25 GMT
Jason Lowe created TEZ-3191:
-------------------------------

             Summary: NM container diagnostics for excess resource usage can be lost if task fails while being killed
                 Key: TEZ-3191
                 URL: https://issues.apache.org/jira/browse/TEZ-3191
             Project: Apache Tez
          Issue Type: Bug
    Affects Versions: 0.7.0
            Reporter: Jason Lowe


This is the Tez version of MAPREDUCE-4955.  I saw a misconfigured Tez job report a task attempt
as failed due to a "Filesystem closed" error, when the real cause was that the NM killed the
container for excess memory usage.  Unfortunately, the SIGTERM sent by the NM caused the
filesystem shutdown hook to close the filesystems, and that triggered a failure in the main
thread.  If that failure is reported to the AM via the umbilical before the NM container status
is received via the RM, then the useful container diagnostics from the NM are lost in the job history.
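
To make the ordering problem concrete, here is a minimal, self-contained sketch of the race.
It is not Tez code: the class AttemptDiagnosticsSketch, its methods, and the message strings
are all hypothetical placeholders chosen for illustration. It assumes the attempt's
diagnostics are fixed by whichever terminal report arrives first, and contrasts that with a
policy that keeps later reports as well. Whether merging is the right fix for Tez is a
separate question; the sketch only shows why the NM diagnostics disappear.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a task attempt's diagnostics; not a Tez or YARN API.
final class AttemptDiagnosticsSketch {
    private final List<String> diagnostics = new ArrayList<>();
    private final boolean keepLateReports;
    private boolean terminal = false;

    AttemptDiagnosticsSketch(boolean keepLateReports) {
        this.keepLateReports = keepLateReports;
    }

    // Record a terminal report from either the task (umbilical) or the NM (via the RM).
    synchronized void record(String source, String message) {
        if (terminal && !keepLateReports) {
            return; // first report wins; anything arriving later is dropped
        }
        diagnostics.add(source + ": " + message);
        terminal = true;
    }

    synchronized String summary() {
        return String.join(" | ", diagnostics);
    }

    public static void main(String[] args) {
        // Placeholder messages, not real log output.
        String taskError = "Filesystem closed";
        String nmReason = "container killed for excess memory usage";

        // Task failure arrives before the NM container status: NM reason is lost.
        AttemptDiagnosticsSketch firstWins = new AttemptDiagnosticsSketch(false);
        firstWins.record("task", taskError);
        firstWins.record("NM", nmReason);
        System.out.println(firstWins.summary());

        // Same arrival order, but late reports are kept alongside the first one.
        AttemptDiagnosticsSketch merged = new AttemptDiagnosticsSketch(true);
        merged.record("task", taskError);
        merged.record("NM", nmReason);
        System.out.println(merged.summary());
    }
}
```

With the first-report-wins policy the summary contains only the task's error, which matches
the symptom described above; with late reports kept, the NM's reason appears next to it.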



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
