Fei Wang created SPARK-17644:
--------------------------------
Summary: The failed stage never resubmitted due to abort stage in another thread
Key: SPARK-17644
URL: https://issues.apache.org/jira/browse/SPARK-17644
Project: Spark
Issue Type: Bug
Components: Scheduler, Spark Core
Affects Versions: 2.0.0, 1.6.0
Reporter: Fei Wang
There is a race condition between handling FetchFailed and resubmitting failed stages:
job1 and job2 run in different threads. If job1 fails 4 times due to FetchFailed and is aborted,
job2 can no longer post ResubmitFailedStages, because failedStages in DAGScheduler is not
empty at that point.
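
The sequence described above can be reproduced with a small toy model. This is only a sketch of the bookkeeping as reported, not Spark's code: the class `ToyScheduler`, its fields, and the stage names are hypothetical, and it assumes that a resubmit event is posted only when the failed-stage set is empty, and that an aborted stage is still left in that set.

```python
class ToyScheduler:
    """Toy model of the reported bookkeeping; hypothetical, not Spark's DAGScheduler."""

    MAX_FETCH_FAILURES = 4  # per the report: job1 is aborted after 4 fetch failures

    def __init__(self):
        self.failed_stages = set()   # stand-in for failedStages
        self.pending_events = []     # stand-in for the scheduler's event queue
        self.failures = {}           # fetch-failure count per stage
        self.aborted = set()

    def on_fetch_failed(self, stage):
        self.failures[stage] = self.failures.get(stage, 0) + 1
        if self.failures[stage] >= self.MAX_FETCH_FAILURES:
            # Abort path: no resubmit event is posted.
            self.aborted.add(stage)
        elif not self.failed_stages:
            # Assumption: an event is posted only when the set is empty,
            # on the theory that one is already pending otherwise.
            self.pending_events.append("ResubmitFailedStages")
        # Assumed flaw: the stage is recorded even on the abort path.
        self.failed_stages.add(stage)

    def process_events(self):
        # Handling a resubmit event drains the failed-stage set.
        while self.pending_events:
            self.pending_events.pop(0)
            self.failed_stages.clear()


sched = ToyScheduler()

# job1: three fetch failures, each followed by a resubmit, then a fourth that aborts.
for _ in range(4):
    sched.on_fetch_failed("job1-stage")
    sched.process_events()

# job2: first fetch failure arrives after job1 was aborted.
sched.on_fetch_failed("job2-stage")

print(sched.pending_events)         # no resubmit event was posted for job2
print(sorted(sched.failed_stages))  # job2's stage is recorded but never resubmitted
```

In this model the aborted stage stays in the set with no event pending, so the emptiness check skips posting an event for job2 and its stage is never picked up again.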