hadoop-mapreduce-dev mailing list archives

From Chris Douglas <chris.doug...@gmail.com>
Subject Re: Understanding task commit/abort protocol
Date Sat, 04 Feb 2017 01:02:57 GMT
It's been a long time, but IIRC this code path isn't going to be
invoked. The AM will never set the preempt flag in the umbilical, so
the task will never transition to the PREEMPTED state.

MapReduce checkpoint/restart of reduce tasks was going to be part of
MAPREDUCE-5269, which signals a ReduceTask to promote its partial
output if both the Reducer and OutputCommitter are tagged as
@Checkpointable. If either is not, then the flag is never set. The
code that would have implemented this was not committed, so it's
really-really not going to be set. -C
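
To illustrate the gating described above, here is a minimal sketch. It is not
the MAPREDUCE-5269 code (which, as noted, was never committed). The
Checkpointable annotation is a local stand-in declared just for this example,
and mayCheckpoint and the four empty classes are hypothetical names.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

class CheckpointGateSketch {

  // Local stand-in for a @Checkpointable marker (assumption: declared here
  // so the example compiles without Hadoop on the classpath).
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.TYPE)
  @interface Checkpointable {}

  // Hypothetical reducer and committer classes, tagged and untagged.
  @Checkpointable static class TaggedReducer {}
  static class PlainReducer {}
  @Checkpointable static class TaggedCommitter {}
  static class PlainCommitter {}

  // The preempt flag would only ever be set when BOTH classes are tagged.
  static boolean mayCheckpoint(Class<?> reducer, Class<?> committer) {
    return reducer.isAnnotationPresent(Checkpointable.class)
        && committer.isAnnotationPresent(Checkpointable.class);
  }

  public static void main(String[] args) {
    // Both tagged: checkpointing would be allowed.
    System.out.println(mayCheckpoint(TaggedReducer.class, TaggedCommitter.class));
    // Committer untagged: the flag is never set.
    System.out.println(mayCheckpoint(TaggedReducer.class, PlainCommitter.class));
  }
}
```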

On Fri, Feb 3, 2017 at 6:41 AM, Steve Loughran <stevel@hortonworks.com> wrote:
>
> In HADOOP-13786 I'm adding a new committer, one which writes to S3 without doing renames.
> It does this by submitting all the data to S3 targeted at the final destination, but doesn't
> send the POST needed to materialize it until the task commits. Abort the task and it cancels
> these pending commits.
>
> This algorithm should be robust provided that only one attempt for a task is committed,
> which comes down to
>
> 1.  Only those tasks which have succeeded are committed
> 2.  Those tasks which have not succeeded have their pending writes aborted
>
>
> Which is where I now have a question. In the class org.apache.hadoop.mapred.Task, OutputCommitter.commitTask()
> is called when a task is pre-empted:
>
>
>   public void done(TaskUmbilicalProtocol umbilical,
>                    TaskReporter reporter
>                    ) throws IOException, InterruptedException {
>     updateCounters();
>     if (taskStatus.getRunState() == TaskStatus.State.PREEMPTED ) {
>       // If we are preempted, do no output promotion; signal done and exit
>       committer.commitTask(taskContext);         /* HERE */
>       umbilical.preempted(taskId, taskStatus);
>       taskDone.set(true);
>       reporter.stopCommunicationThread();
>       return;
>     }
>
> That's despite the line above saying "do no output promotion", and, judging by its place
> in the code, looking like it's the handler for task preempted state.
>
> Shouldn't it be doing a task abort here?
>
> I suspect the sole reason this hasn't shown up as a problem before is that this is the
> sole use of TaskStatus.State.PREEMPTED in the hadoop code: this particular codepath is never
> executed. In which case, culling it may be the correct option.
>
> Thoughts?
>
> -Steve
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: mapreduce-dev-unsubscribe@hadoop.apache.org
> For additional commands, e-mail: mapreduce-dev-help@hadoop.apache.org
>
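
As a rough sketch of the invariant in the quoted message (commit only attempts
that succeeded, abort the pending writes of every other attempt), the
following may help. It is not the HADOOP-13786 committer: every name here is
hypothetical, and in-memory collections stand in for uploads to S3 that have
been staged but not yet materialized.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PendingCommitSketch {

  // Hypothetical end states for a task attempt.
  enum State { SUCCEEDED, FAILED, PREEMPTED }

  // Attempt id -> paths staged but not yet visible at the destination.
  private final Map<String, List<String>> pending = new HashMap<>();
  // Paths that have been materialized by a commit.
  private final List<String> visible = new ArrayList<>();

  void stage(String attemptId, String path) {
    pending.computeIfAbsent(attemptId, k -> new ArrayList<>()).add(path);
  }

  // Materialize everything the attempt staged.
  void commitTask(String attemptId) {
    List<String> staged = pending.remove(attemptId);
    if (staged != null) {
      visible.addAll(staged);
    }
  }

  // Cancel everything the attempt staged; nothing becomes visible.
  void abortTask(String attemptId) {
    pending.remove(attemptId);
  }

  // Commit only on success; any other state (including PREEMPTED) aborts.
  void finish(String attemptId, State state) {
    if (state == State.SUCCEEDED) {
      commitTask(attemptId);
    } else {
      abortTask(attemptId);
    }
  }

  List<String> visible() { return visible; }

  boolean hasPending(String attemptId) { return pending.containsKey(attemptId); }

  public static void main(String[] args) {
    PendingCommitSketch c = new PendingCommitSketch();
    c.stage("attempt_0", "part-0000");
    c.stage("attempt_1", "part-0000");
    c.finish("attempt_0", State.PREEMPTED); // aborted, nothing visible
    c.finish("attempt_1", State.SUCCEEDED); // the one committed attempt
    System.out.println(c.visible());
  }
}
```

Under this sketch a preempted attempt takes the abort branch, which is the
behaviour the question above suggests the PREEMPTED handler should have.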
