spark-dev mailing list archives

From Pete Robbins <>
Subject Re: SparkOscope: Enabling Spark Optimization through Cross-stack Monitoring and Visualization
Date Fri, 05 Feb 2016 09:09:08 GMT

I'm interested in what you've done here, as I was looking for ways to let
the Spark UI display custom metrics in a pluggable way without having to
modify the Spark source code. It would be good to see if we could adapt
your code to add extension points into the UI so we could configure the
sources of the additional metrics. For instance, rather than creating
events from your HDFS files, I would like to have a module that pulls in
system/JVM metrics stored in, e.g., Elasticsearch.
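The extension-point idea above could be sketched as a small provider registry that the UI layer iterates over instead of hard-coding its metric sources. This is purely illustrative: no such interface exists in Spark today, and every name below (`MetricsProvider`, `MetricsProviderRegistry`) is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical pluggable provider of named metric values
 *  (could be backed by Elasticsearch, HDFS files, JMX, ...). */
interface MetricsProvider {
    String name();
    Map<String, Double> poll();
}

/** Hypothetical registry a UI extension point could iterate over. */
final class MetricsProviderRegistry {
    private final Map<String, MetricsProvider> providers = new ConcurrentHashMap<>();

    void register(MetricsProvider p) {
        providers.put(p.name(), p);
    }

    /** Snapshot of every provider's current metrics, keyed as "<provider>.<metric>". */
    Map<String, Double> snapshot() {
        Map<String, Double> out = new LinkedHashMap<>();
        for (MetricsProvider p : providers.values()) {
            p.poll().forEach((k, v) -> out.put(p.name() + "." + k, v));
        }
        return out;
    }
}
```

A UI page would then render whatever `snapshot()` returns, so adding a new backend means registering one more provider rather than patching Spark itself.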

Do any of the Spark committers have any thoughts on this?


On 3 February 2016 at 15:26, Yiannis Gkoufas <> wrote:

> Hi all,
> I just wanted to introduce some of my recent work at IBM Research around
> Spark, in particular its metrics system and Web UI.
> As a quick overview of our contributions:
> We have created a new type of Sink for the metrics (HDFSSink), which
> captures the metrics into HDFS.
> We have extended the metrics reported by the Executors to include OS-level
> metrics on CPU, RAM, disk I/O, and network I/O, using the Hyperic Sigar
> library.
> We have extended the Web UI for completed applications to visualize
> any of the above metrics the user wants.
> The above functionalities can be configured in the and
> spark-defaults.conf files.
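Custom sinks are normally wired into Spark's metric system through a properties-style configuration. As a minimal sketch only: the class path and property names for the HDFSSink below are assumptions, not taken from the repo.

```properties
# Hypothetical wiring for the HDFSSink, for all instances
# (master, worker, executor, driver); substitute the actual
# class name from the SparkOscope repo.
*.sink.hdfs.class=org.apache.spark.metrics.sink.HDFSSink

# Flush interval for writing metrics out.
*.sink.hdfs.period=10
*.sink.hdfs.unit=seconds
```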
> We have recorded a small demo that shows those capabilities, which you can
> find here:
> There is a blog post which gives more details on the functionality here:
> and there is also a public repo where anyone can try it:
> I would really appreciate any feedback or advice regarding this work,
> especially whether you think it is worth upstreaming to the official Spark
> repository.
> Thanks a lot!
