spark-dev mailing list archives

From Umar Javed
Subject time taken to fetch input partition by map
Date Thu, 21 Nov 2013 05:20:26 GMT

The metrics provide information for the reduce (i.e. shuffleReaders) tasks
about the time taken to fetch the shuffle outputs. Is there a way I can
find out the time taken by a map task (i.e. shuffleWriter) on a remote
machine to read its input partition from disk?

I believe I should look in HadoopRDD.scala, where getRecordReader is used,
and the imports at the top of that file show that it should be in
org.apache.hadoop.mapred.RecordReader, but I can't find that file.
Any help would be appreciated.

