hadoop-mapreduce-dev mailing list archives

From Steve Loughran <ste...@hortonworks.com>
Subject Re: Understanding task commit/abort protocol
Date Wed, 08 Feb 2017 14:52:51 GMT

> On 3 Feb 2017, at 20:02, Chris Douglas <chris.douglas@gmail.com> wrote:
> 
> It's been a long time, but IIRC this isn't going to be invoked. The AM
> will never set the preempt flag in the umbilical, so the task will
> never transition to this state.
> 
> MapReduce checkpoint/restart of reduce tasks was going to be part of
> MAPREDUCE-5269, which signals a ReduceTask to promote its partial
> output if both the Reducer and OutputCommitter are tagged as
> @Checkpointable. If either is not, then the flag is never set. The
> code that would have implemented this was not committed, so it's
> really-really not going to be set. -C

I didn't think it was being used, but thanks for clarifying this.

Should that code snippet be culled? Or, at the least, should the preempted branch be changed so that it actually calls abortTask rather than commitTask?
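
To make the second option concrete, here is a minimal, self-contained sketch of a preempted branch that aborts instead of committing. It is not the Hadoop code: Committer, RecordingCommitter, State and the String attempt id are hypothetical stand-ins for OutputCommitter, TaskStatus.State and the task context, and the normal (non-preempted) path is reduced to a single call.

```java
import java.util.ArrayList;
import java.util.List;

public class PreemptedAbortSketch {

  /** Hypothetical stand-in for the two OutputCommitter calls under discussion. */
  interface Committer {
    void commitTask(String attemptId);
    void abortTask(String attemptId);
  }

  /** Hypothetical stand-in for the task run state. */
  enum State { RUNNING, PREEMPTED }

  /** Records every call so the behaviour can be inspected afterwards. */
  static class RecordingCommitter implements Committer {
    final List<String> calls = new ArrayList<>();

    @Override
    public void commitTask(String attemptId) {
      calls.add("commit:" + attemptId);
    }

    @Override
    public void abortTask(String attemptId) {
      calls.add("abort:" + attemptId);
    }
  }

  /**
   * Heavily simplified done(): a preempted attempt promotes no output,
   * so its work is aborted; any other attempt is committed.
   */
  static void done(State state, Committer committer, String attemptId) {
    if (state == State.PREEMPTED) {
      // Matches the "do no output promotion" comment: abort, then exit.
      committer.abortTask(attemptId);
      return;
    }
    // Assumption: the real commit path does far more than this one call.
    committer.commitTask(attemptId);
  }
}
```

With this shape, a preempted attempt only ever reaches abortTask, which is what a committer holding pending writes would need.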

> 
> On Fri, Feb 3, 2017 at 6:41 AM, Steve Loughran <stevel@hortonworks.com> wrote:
>> 
>> In HADOOP-13786 I'm adding a new committer, one which writes to S3 without doing renames. It does this by submitting all the data to S3 targeted at the final destination, but doesn't send the POST needed to materialize it until the task commits. Abort the task and it cancels these pending commits.
>> 
>> This algorithm should be robust provided that only one attempt for a task is committed, which comes down to:
>> 
>> 1.  Only those tasks which have succeeded are committed.
>> 2.  Those tasks which have not succeeded have their pending writes aborted.
>> 
>> 
>> Which is where I now have a question. In the class org.apache.hadoop.mapred.Task, OutputCommitter.commitTask() is called when a task is pre-empted:
>> 
>> 
>>  public void done(TaskUmbilicalProtocol umbilical,
>>                   TaskReporter reporter
>>                   ) throws IOException, InterruptedException {
>>    updateCounters();
>>    if (taskStatus.getRunState() == TaskStatus.State.PREEMPTED ) {
>>      // If we are preempted, do no output promotion; signal done and exit
>>      committer.commitTask(taskContext);         /* HERE */
>>      umbilical.preempted(taskId, taskStatus);
>>      taskDone.set(true);
>>      reporter.stopCommunicationThread();
>>      return;
>>    }
>> 
>> That's despite the line above saying "do no output promotion", and, judging by its place in the code, looking like it's the handler for task preempted state.
>> 
>> Shouldn't it be doing a task abort here?
>> 
>> I suspect the sole reason this hasn't shown up as a problem before is that this is the sole use of TaskStatus.State.PREEMPTED in the hadoop code: this particular codepath is never executed. In which case, culling it may be the correct option.
>> 
>> Thoughts?
>> 
>> -Steve
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: mapreduce-dev-unsubscribe@hadoop.apache.org
>> For additional commands, e-mail: mapreduce-dev-help@hadoop.apache.org
>> 
> 
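
For reference, the commit/abort behaviour described in the quoted message can be modelled with a toy in-memory store. This is only a sketch under stated assumptions, not the HADOOP-13786 code: FakeStore, TaskAttempt and their methods are hypothetical names, and the "store" is a pair of Java collections rather than S3.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PendingCommitSketch {

  /** Hypothetical store: an upload stays invisible until it is completed. */
  static class FakeStore {
    final Map<String, String> pending = new HashMap<>(); // upload id -> destination
    final List<String> visible = new ArrayList<>();      // materialized destinations
    private int nextId = 0;

    /** Registers data aimed at the final destination without materializing it. */
    String startUpload(String destination) {
      String id = "upload-" + (nextId++);
      pending.put(id, destination);
      return id;
    }

    /** Materializes a pending upload; unknown ids are ignored. */
    void complete(String id) {
      String destination = pending.remove(id);
      if (destination != null) {
        visible.add(destination);
      }
    }

    /** Cancels a pending upload so it can never become visible. */
    void abort(String id) {
      pending.remove(id);
    }
  }

  /** One task attempt, tracking the uploads it has started. */
  static class TaskAttempt {
    private final FakeStore store;
    private final List<String> uploads = new ArrayList<>();

    TaskAttempt(FakeStore store) {
      this.store = store;
    }

    void write(String destination) {
      uploads.add(store.startUpload(destination));
    }

    /** Rule 1: a succeeded attempt completes its pending uploads. */
    void commitTask() {
      uploads.forEach(store::complete);
      uploads.clear();
    }

    /** Rule 2: any other attempt cancels its pending uploads. */
    void abortTask() {
      uploads.forEach(store::abort);
      uploads.clear();
    }
  }
}
```

If two attempts of the same task both write to one destination, committing one and aborting the other leaves a single visible output and nothing pending. Calling commitTask on an attempt that should have been aborted, as the preempted branch does, would break that property.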



