spark-issues mailing list archives

From "Cheng Lian (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-975) Spark Replay Debugger
Date Tue, 29 Apr 2014 02:39:14 GMT

    [ https://issues.apache.org/jira/browse/SPARK-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983905#comment-13983905
] 

Cheng Lian commented on SPARK-975:
----------------------------------

Hi [~sarutak], thanks for caring about this. Sorry that this issue hasn't been updated for
a while. At the time SRD was developed, the related interfaces exposed by Spark and used in SRD
were not well chosen and exposed some implementation details to API users, so SRD has not been
merged yet. We do have plans to improve Spark's debugging facilities. Before we settle on a final
design, I would like to rebase the SRD branch onto the current master so that people can use
it to debug and analyze their applications, though I can't promise anything for now.

> Spark Replay Debugger
> ---------------------
>
>                 Key: SPARK-975
>                 URL: https://issues.apache.org/jira/browse/SPARK-975
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core
>    Affects Versions: 0.9.0
>            Reporter: Cheng Lian
>              Labels: arthur, debugger
>
> The Spark debugger was first mentioned as {{rddbg}} in the [RDD technical report|http://www.cs.berkeley.edu/~matei/papers/2011/tr_spark.pdf].
> [Arthur|https://github.com/mesos/spark/tree/arthur], authored by [Ankur Dave|https://github.com/ankurdave],
is an old implementation of the Spark debugger, which demonstrated both the elegance and power
behind the RDD abstraction.  Unfortunately, the corresponding GitHub branch was never merged
into the master branch, and development on it stopped 2 years ago.  For more information about Arthur, please
refer to [the Spark Debugger Wiki page|https://github.com/mesos/spark/wiki/Spark-Debugger]
in the old GitHub repository.
> A complete Spark debugger would be a useful tool for Spark application debugging and
analysis.  In [PR-224|https://github.com/apache/incubator-spark/pull/224],
I propose a new implementation of the Spark debugger, the Spark Replay Debugger (SRD).
> [PR-224|https://github.com/apache/incubator-spark/pull/224] is only a preview for discussion.
 In the current version, I have only implemented features that illustrate the basic mechanisms.
 Some features that appeared in Arthur are still missing in SRD, such as checksum-based nondeterminism
detection and single-task debugging with a conventional debugger (like {{jdb}}).  However, these
features can easily be built upon the current SRD framework.  To minimize code review effort,
I intentionally left them out of the current version.
> Attached is the visualization of the MLlib ALS application (with 1 iteration) generated
by SRD.  For more information, please refer to [the SRD overview document|http://spark-replay-debugger-overview.readthedocs.org/en/latest/].



--
This message was sent by Atlassian JIRA
(v6.2#6252)
