spark-issues mailing list archives

From "Apache Spark (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (SPARK-21962) Distributed Tracing in Spark
Date Sat, 14 Apr 2018 00:15:01 GMT

     [ https://issues.apache.org/jira/browse/SPARK-21962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-21962:
------------------------------------

    Assignee: Apache Spark

> Distributed Tracing in Spark
> ----------------------------
>
>                 Key: SPARK-21962
>                 URL: https://issues.apache.org/jira/browse/SPARK-21962
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core
>    Affects Versions: 2.2.0
>            Reporter: Andrew Ash
>            Assignee: Apache Spark
>            Priority: Major
>
> Spark should support distributed tracing, the mechanism popularized by Google in the
> [Dapper paper|https://research.google.com/pubs/pub36356.html], in which network requests
> carry additional metadata used to trace requests between services.
> This would be useful for me since I have OpenZipkin-style tracing in my distributed application
> up to the Spark driver, and from the executors out to my other services, but the link is broken
> in Spark between driver and executor since the span IDs aren't propagated across that link.
> An initial implementation could instrument the most important network calls with trace
> IDs (like launching and finishing tasks), and incrementally add more tracing to other calls
> (torrent block distribution, external shuffle service, etc.) as the feature matures.
> Search keywords: Dapper, Brave, OpenZipkin, HTrace
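
The propagation the issue asks for can be illustrated with a small, self-contained sketch. It does not use any Spark API and is not how Spark implements (or would implement) tracing: the functions driver_launch_task and executor_run_task, the message layout, and the SpanContext class are all hypothetical names invented for illustration. The only borrowed detail is the set of B3 header names that OpenZipkin uses for propagation. The idea is that the driver attaches the trace ID and span IDs to the task-launch message, and the executor reads them back so its spans join the same trace.

```python
import secrets
from dataclasses import dataclass
from typing import Dict, Optional

# B3 propagation header names used by OpenZipkin.
TRACE_ID = "X-B3-TraceId"
SPAN_ID = "X-B3-SpanId"
PARENT_SPAN_ID = "X-B3-ParentSpanId"


@dataclass(frozen=True)
class SpanContext:
    """Hypothetical container for the IDs that identify one span."""
    trace_id: str
    span_id: str
    parent_span_id: Optional[str] = None


def new_id() -> str:
    # 8 random bytes rendered as 16 lowercase hex characters (a 64-bit ID).
    return secrets.token_hex(8)


def start_root_span() -> SpanContext:
    return SpanContext(trace_id=new_id(), span_id=new_id())


def child_of(parent: SpanContext) -> SpanContext:
    # A child keeps the trace ID and records the parent's span ID.
    return SpanContext(parent.trace_id, new_id(), parent.span_id)


def inject(ctx: SpanContext, carrier: Dict[str, str]) -> None:
    """Write the span context into a string-to-string carrier."""
    carrier[TRACE_ID] = ctx.trace_id
    carrier[SPAN_ID] = ctx.span_id
    if ctx.parent_span_id is not None:
        carrier[PARENT_SPAN_ID] = ctx.parent_span_id


def extract(carrier: Dict[str, str]) -> Optional[SpanContext]:
    """Read a span context back, or return None if the IDs are missing."""
    trace_id = carrier.get(TRACE_ID)
    span_id = carrier.get(SPAN_ID)
    if trace_id is None or span_id is None:
        return None
    return SpanContext(trace_id, span_id, carrier.get(PARENT_SPAN_ID))


def driver_launch_task(driver_span: SpanContext, task_id: int) -> dict:
    # Hypothetical driver side: open a child span for the task launch and
    # ship its IDs alongside the task description.
    task_span = child_of(driver_span)
    message = {"task_id": task_id, "trace": {}}
    inject(task_span, message["trace"])
    return message


def executor_run_task(message: dict) -> Optional[SpanContext]:
    # Hypothetical executor side: recover the context so that any work done
    # here, and any outgoing calls, stay linked to the driver's trace.
    return extract(message["trace"])
```

On the executor, a call out to another service would then use child_of on the extracted context and inject the result into that request's headers, which is the step that reconnects the driver-side and executor-side halves of the trace.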



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


