spark-user mailing list archives

From Pedro Rodriguez
Subject Custom RDD: Report Size of Partition in Bytes to Spark
Date Mon, 04 Jul 2016 02:46:10 GMT
Hi All,

I noticed that for some Spark jobs the UI shows the input/output sizes read and written. I am
implementing a custom RDD that reads files and would like to report these metrics to Spark,
since they are available to me.

I looked through the RDD source code and a couple of different implementations, and the best I
could find were some Hadoop metrics. Is there a way to simply report the number of bytes a
partition read so that Spark can show it in the UI?

Pedro Rodriguez
PhD Student in Large-Scale Machine Learning | CU Boulder
Systems Oriented Data Scientist
UC Berkeley AMPLab Alumni | 909-353-4423