hadoop-mapreduce-dev mailing list archives

From "Amareshwari Sriramadasu (JIRA)" <j...@apache.org>
Subject [jira] Created: (MAPREDUCE-1475) Race condition while launching task cleanup attempt.
Date Wed, 10 Feb 2010 08:54:27 GMT
Race condition while launching task cleanup attempt.

                 Key: MAPREDUCE-1475
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1475
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: tasktracker
    Affects Versions: 0.20.1
            Reporter: Amareshwari Sriramadasu

We found a race condition while launching a task cleanup attempt on a TaskTracker. The race
leaves a slot permanently occupied.

The scenario is the following:
The main attempt was killed by the TaskTracker because it was a speculative attempt. A cleanup
attempt was then scheduled on the same tracker. The cleanup attempt occupied the slot and was
about to start. However, a done() RPC from the earlier attempt was still pending in the RPC
queue. Before the cleanup attempt could be launched, the TaskTracker processed that RPC and
set the state of the cleanup attempt to KILLED. The launcher did not launch the cleanup attempt
because it was already KILLED. The done() RPC itself failed with a NullPointerException because
of the incorrect state. In summary, the slot was occupied by a cleanup attempt that could never
be launched, and the slot was never released.
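
The ordering described above can be reproduced in a small, single-threaded sketch. This is
not TaskTracker code: the class SlotLeakSketch, the Tracker type, its methods, and the
releaseOnSkippedLaunch flag are all hypothetical names made up for illustration, and the
NullPointerException in done() is not modeled. The sketch only shows how a slot reserved
before launch stays occupied if the launch is skipped, and one possible (assumed, not
necessarily the project's) way to avoid that.

```java
public class SlotLeakSketch {
    enum State { UNASSIGNED, RUNNING, KILLED }

    /** Hypothetical stand-in for a tracker with a fixed number of slots. */
    static class Tracker {
        int freeSlots;
        State cleanupState = State.UNASSIGNED;
        // Assumption for illustration: whether a skipped launch gives the slot back.
        final boolean releaseOnSkippedLaunch;

        Tracker(int slots, boolean releaseOnSkippedLaunch) {
            this.freeSlots = slots;
            this.releaseOnSkippedLaunch = releaseOnSkippedLaunch;
        }

        /** The cleanup attempt occupies a slot before it is actually launched. */
        void reserveSlotForCleanup() {
            freeSlots--;
        }

        /** The stale done() from the earlier attempt marks the cleanup attempt KILLED. */
        void processStaleDone() {
            cleanupState = State.KILLED;
        }

        /** The launcher refuses to start an attempt that is already KILLED. */
        void launchCleanup() {
            if (cleanupState == State.KILLED) {
                if (releaseOnSkippedLaunch) {
                    freeSlots++; // hand the reserved slot back
                }
                return;
            }
            cleanupState = State.RUNNING;
        }
    }

    /** Replays the reported ordering and returns the number of free slots afterwards. */
    static int freeSlotsAfterRace(boolean releaseOnSkippedLaunch) {
        Tracker tracker = new Tracker(2, releaseOnSkippedLaunch);
        tracker.reserveSlotForCleanup(); // slot taken, attempt not yet started
        tracker.processStaleDone();      // pending RPC is handled first
        tracker.launchCleanup();         // launch is skipped
        return tracker.freeSlots;
    }

    public static void main(String[] args) {
        System.out.println("as reported: " + freeSlotsAfterRace(false));
        System.out.println("releasing on skipped launch: " + freeSlotsAfterRace(true));
    }
}
```

With two slots to start with, the first call leaves one slot free (the leak), while the
second leaves both free.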

