spark-user mailing list archives

From Takeshi Yamamuro <linguin....@gmail.com>
Subject Re: Custom RDD: Report Size of Partition in Bytes to Spark
Date Mon, 04 Jul 2016 04:31:29 GMT
How about using `SparkListener`?
You can collect I/O statistics yourself through `TaskMetrics#inputMetrics`, for example in the listener's `onTaskEnd` callback.
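
A minimal sketch of that idea, assuming the Spark 2.x listener API, where `taskMetrics.inputMetrics` is a plain `InputMetrics` value (in 1.x it was an `Option`). The class name `InputBytesListener` and the context variable `sc` are hypothetical and used only for illustration:

```scala
import java.util.concurrent.atomic.AtomicLong

import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd}

// Hypothetical listener: sums the input bytes reported by every finished task.
class InputBytesListener extends SparkListener {
  val totalBytesRead = new AtomicLong(0L)

  override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
    // taskMetrics may be null (for example, for some failed tasks), so guard against it.
    val metrics = taskEnd.taskMetrics
    if (metrics != null) {
      totalBytesRead.addAndGet(metrics.inputMetrics.bytesRead)
    }
  }
}

// Usage, assuming an existing SparkContext named `sc`:
//   val listener = new InputBytesListener
//   sc.addSparkListener(listener)
//   ... run jobs ...
//   println(listener.totalBytesRead.get)
```

Note that a listener only observes whatever the tasks have already recorded, so `bytesRead` will presumably stay at zero for an RDD that never populates its input metrics.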

// maropu

On Mon, Jul 4, 2016 at 11:46 AM, Pedro Rodriguez <ski.rodriguez@gmail.com>
wrote:

> Hi All,
>
> I noticed on some Spark jobs it shows you input/output read size. I am
> implementing a custom RDD which reads files and would like to report these
> metrics to Spark since they are available to me.
>
> I looked through the RDD source code and a couple different
> implementations and the best I could find were some Hadoop metrics. Is
> there a way to simply report the number of bytes a partition read so Spark
> can put it on the UI?
>
> Thanks,
> —
> Pedro Rodriguez
> PhD Student in Large-Scale Machine Learning | CU Boulder
> Systems Oriented Data Scientist
> UC Berkeley AMPLab Alumni
>
> pedrorodriguez.io | 909-353-4423
> github.com/EntilZha | LinkedIn
> <https://www.linkedin.com/in/pedrorodriguezscience>
>



-- 
---
Takeshi Yamamuro
