spark-issues mailing list archives

From "Kay Ousterhout (JIRA)" <j...@apache.org>
Subject [jira] [Closed] (SPARK-19560) Improve tests for when DAGScheduler learns of "successful" ShuffleMapTask from a failed executor
Date Fri, 24 Feb 2017 19:44:44 GMT

     [ https://issues.apache.org/jira/browse/SPARK-19560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kay Ousterhout closed SPARK-19560.
----------------------------------
          Resolution: Fixed
    Target Version/s: 2.2.0

> Improve tests for when DAGScheduler learns of "successful" ShuffleMapTask from a failed executor
> ------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-19560
>                 URL: https://issues.apache.org/jira/browse/SPARK-19560
>             Project: Spark
>          Issue Type: Test
>          Components: Scheduler
>    Affects Versions: 2.1.1
>            Reporter: Kay Ousterhout
>            Assignee: Kay Ousterhout
>            Priority: Minor
>
> There is some tricky code around the case where the DAGScheduler learns of a ShuffleMapTask
> that completed successfully but ran on an executor that failed sometime after the task was
> launched. This case is tricky because the TaskSetManager (i.e., the lower-level scheduler)
> thinks the task completed successfully, but the DAGScheduler considers the output it generated
> to be no longer valid (because it was probably lost when the executor was lost). As a result,
> the DAGScheduler needs to re-submit the stage so that the task can be re-run. Some existing
> tests exercise this behavior, but it is not clearly documented, so we should improve the tests
> to prevent future bugs (this was encountered by [~markhamstra] in attempting to find a better
> fix for SPARK-19263).
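
The scenario described above can be sketched with a toy model. This is not Spark's code: the class ToyDagScheduler, its methods, and the executor names are all hypothetical, and the real DAGScheduler keeps considerably more state. The sketch only illustrates the one idea from the description: a "successful" ShuffleMapTask reported from an executor that is already known to have failed does not count as usable map output, so its partition still has to be re-run.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDagScheduler:
    """Hypothetical stand-in for the DAGScheduler's view; not Spark's real class."""

    num_partitions: int
    failed_executors: set = field(default_factory=set)
    # Maps partition index -> executor believed to hold that partition's map output.
    map_outputs: dict = field(default_factory=dict)

    def executor_lost(self, executor_id):
        """Record the failure and forget any output stored on that executor."""
        self.failed_executors.add(executor_id)
        self.map_outputs = {
            p: e for p, e in self.map_outputs.items() if e != executor_id
        }

    def shuffle_map_task_succeeded(self, partition, executor_id):
        """Handle a success report; return True only if the output is accepted."""
        if executor_id in self.failed_executors:
            # The lower-level scheduler saw a success, but the output was
            # probably lost along with the executor, so it is ignored here.
            return False
        self.map_outputs[partition] = executor_id
        return True

    def partitions_to_resubmit(self):
        """Partitions with no valid map output, which must be re-run."""
        return [p for p in range(self.num_partitions) if p not in self.map_outputs]


scheduler = ToyDagScheduler(num_partitions=2)
scheduler.shuffle_map_task_succeeded(0, "exec-A")   # accepted
scheduler.executor_lost("exec-B")                   # exec-B fails after launch
accepted = scheduler.shuffle_map_task_succeeded(1, "exec-B")  # late "success"
print(accepted, scheduler.partitions_to_resubmit())  # False [1]
```

In this toy run the late success from exec-B is rejected and partition 1 remains outstanding, which corresponds to the stage re-submission the description mentions.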



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

