spark-issues mailing list archives

From "Jisoo Kim (JIRA)" <>
Subject [jira] [Commented] (SPARK-19698) Race condition in stale attempt task completion vs current attempt task completion when task is doing persistent state changes
Date Thu, 23 Feb 2017 23:11:44 GMT


Jisoo Kim commented on SPARK-19698:

[~kayousterhout] Thanks for linking the JIRA ticket. I agree that it describes a problem very
similar to the one I had. However, I don't think its PR fixes my problem, because the PR only
handles ShuffleMapStage and doesn't check the attempt ID for ResultStage.
In my case, it was the ResultStage that had the problem. I ran my test with the fix from (
but it still failed.

Could you point me to where the driver waits until all tasks finish? I tried to find that code
but wasn't successful. I don't think the driver shuts down all tasks when a job is done; however,
DAGScheduler signals the JobWaiter every time it receives a completion event for a task that
is responsible for an unfinished partition (
As a result, JobWaiter will call success() on the job promise (
before the 2nd task attempt finishes. This would not be a problem if the driver waited until
all tasks finished and SparkContext didn't return results before then, but I haven't
found that it does (please correct me if I am missing something). I call SparkContext.stop()
after I get the result from the application, to clean up and upload event logs so I can view
the Spark history from the history server. When SparkContext stops, AFAIK, it stops the
driver as well, which shuts down the task scheduler and executors, and I don't think executors
wait to finish their running tasks before shutting down. Hence, if this happens, I think the 2nd
task attempt will get shut down as well.
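
To make the sequence above concrete, here is a minimal toy model in Python. It is not Spark's code: ToyJobWaiter, ToyScheduler and handle_completion are hypothetical names, and the only behavior assumed is what is described above, namely that a completion event counts once per unfinished partition regardless of which stage attempt produced it.

```python
from dataclasses import dataclass, field


@dataclass
class ToyJobWaiter:
    """Hypothetical stand-in for a job waiter: counts finished partitions."""
    total_partitions: int
    finished: int = 0
    job_done: bool = False

    def task_succeeded(self) -> None:
        self.finished += 1
        if self.finished == self.total_partitions:
            # Analogous to calling success() on the job promise.
            self.job_done = True


@dataclass
class ToyScheduler:
    """Hypothetical scheduler that tracks finished partitions, not attempts."""
    waiter: ToyJobWaiter
    finished_partitions: set = field(default_factory=set)

    def handle_completion(self, partition: int, stage_attempt: int) -> None:
        # stage_attempt is deliberately ignored: only the partition matters.
        if partition not in self.finished_partitions:
            self.finished_partitions.add(partition)
            self.waiter.task_succeeded()


waiter = ToyJobWaiter(total_partitions=2)
scheduler = ToyScheduler(waiter)

# Stage attempt 1 has relaunched tasks for both partitions.
running = {(0, 1), (1, 1)}  # (partition, stage_attempt)

# A stale attempt-0 task for partition 1 reports success.
scheduler.handle_completion(partition=1, stage_attempt=0)

# The attempt-1 task for partition 0 reports success.
scheduler.handle_completion(partition=0, stage_attempt=1)
running.discard((0, 1))

# The job is now considered done while (1, 1) is still in flight.
print(waiter.job_done, running)
```

In this model the job is reported as finished while the attempt-1 task for partition 1 is still running, which is the window in which a SparkContext.stop() call would shut that task down.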

> Race condition in stale attempt task completion vs current attempt task completion when
task is doing persistent state changes
> ------------------------------------------------------------------------------------------------------------------------------
>                 Key: SPARK-19698
>                 URL:
>             Project: Spark
>          Issue Type: Bug
>          Components: Mesos, Spark Core
>    Affects Versions: 2.0.0
>            Reporter: Charles Allen
> We have encountered a strange scenario in our production environment. Below is the best
guess we have right now as to what's going on.
> Potentially, the final stage of a job has a failure in one of the tasks (such as OOME
on the executor) which can cause tasks for that stage to be relaunched in a second attempt.
> keeps track of which tasks have been completed, but does NOT keep track of which attempt
those tasks were completed in. As such, we have encountered a scenario where a particular
task gets executed twice in different stage attempts, and the DAGScheduler does not consider
if the second attempt is still running. This means if the first task attempt succeeded, the
second attempt can be cancelled part-way through its run cycle if all other tasks (including
the prior failed) are completed successfully.
> What this means is that if a task is manipulating some state somewhere (for example:
an upload-to-temporary-file-location, then delete-then-move on an underlying s3n storage implementation)
the driver can improperly shut down the running (2nd attempt) task between state manipulations,
leaving the persistent state in a bad state since the 2nd attempt never got to complete its
manipulations, and was terminated prematurely at some arbitrary point in its state change
logic (ex: finished the delete but not the move).
> This is using the Mesos coarse-grained executor. It is unclear if this behavior is limited
to the Mesos coarse-grained executor or not.
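
The "finished the delete but not the move" case in the description can be illustrated with a small sketch. Everything here is an assumption made for illustration: a plain dict stands in for the storage layer, and commit, Killed and the key names are hypothetical, not part of Spark or any s3n implementation.

```python
class Killed(Exception):
    """Hypothetical marker for a task being terminated mid-run."""


def commit(store: dict, tmp_key: str, final_key: str,
           kill_after_delete: bool = False) -> None:
    """Non-atomic delete-then-move, as described in the issue."""
    # Step 1: delete whatever is currently at the final location.
    store.pop(final_key, None)
    if kill_after_delete:
        # Simulates the executor being shut down between the two steps.
        raise Killed("terminated between delete and move")
    # Step 2: move the temporary object to the final location.
    store[final_key] = store.pop(tmp_key)


# The first attempt already committed its output.
store = {"final/part-0": "output of attempt 0"}

# The second attempt uploads to a temporary location, then starts its commit.
store["tmp/part-0"] = "output of attempt 1"
try:
    commit(store, "tmp/part-0", "final/part-0", kill_after_delete=True)
except Killed:
    pass

# The final location is now empty even though the job was reported as successful.
print(store)
```

After the simulated termination only the temporary object remains, so the output that the first attempt had committed is gone and nothing has replaced it.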
