hadoop-mapreduce-dev mailing list archives

From "Jason Lowe (Created) (JIRA)" <j...@apache.org>
Subject [jira] [Created] (MAPREDUCE-4099) ApplicationMaster may fail to remove staging directory
Date Tue, 03 Apr 2012 14:02:24 GMT
ApplicationMaster may fail to remove staging directory

                 Key: MAPREDUCE-4099
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4099
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: mrv2
    Affects Versions: 0.23.2
            Reporter: Jason Lowe
            Priority: Critical

When the ApplicationMaster (AM) shuts down, it is supposed to remove the staging directory,
unless properties were set to override this behavior. During shutdown, the AM tells the
ResourceManager (RM) that it has finished before it cleans up the staging directory. However,
upon hearing that the AM has finished, the RM immediately kills the AM container. If the AM is
too slow, it is killed before the staging directory is removed.
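
The ordering problem described above can be illustrated with a small, self-contained simulation.
This is only a sketch and not the actual AM code: the class ShutdownOrderSketch, the Step enum,
and runUntilKilled are hypothetical names, and the model assumes the worst case of the race, in
which the RM kills the container as soon as the AM reports that it has finished. The reordered
sequence is shown for comparison only; it is one possible ordering, not necessarily the fix.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ShutdownOrderSketch {
    /** Hypothetical names for the two shutdown steps described in the report. */
    enum Step { UNREGISTER_FROM_RM, REMOVE_STAGING_DIR }

    /**
     * Runs the steps in the given order and returns the ones that completed.
     * Assumption: the RM kills the container immediately after the AM
     * unregisters, so no step after UNREGISTER_FROM_RM ever runs.
     */
    static List<Step> runUntilKilled(List<Step> order) {
        List<Step> completed = new ArrayList<>();
        for (Step step : order) {
            completed.add(step);
            if (step == Step.UNREGISTER_FROM_RM) {
                break; // container killed; remaining steps are skipped
            }
        }
        return completed;
    }

    public static void main(String[] args) {
        // Order described in the report: unregister first, then clean up.
        List<Step> reported =
            Arrays.asList(Step.UNREGISTER_FROM_RM, Step.REMOVE_STAGING_DIR);
        // Illustrative alternative: clean up first, then unregister.
        List<Step> reordered =
            Arrays.asList(Step.REMOVE_STAGING_DIR, Step.UNREGISTER_FROM_RM);

        System.out.println("reported order completed:  " + runUntilKilled(reported));
        System.out.println("reordered steps completed: " + runUntilKilled(reordered));
    }
}
```

In this model the reported order never reaches REMOVE_STAGING_DIR, while the reordered sequence
completes both steps. The real AM loses the race only when it is too slow, so the simulation
shows the losing case rather than every run.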

We're seeing the AM lose this race fairly consistently on our clusters, and the lack of staging
directory cleanup quickly leads to filesystem quota issues for some users.
