spark-dev mailing list archives

From Umar Javed <umarj.ja...@gmail.com>
Subject time taken to fetch input partition by map
Date Thu, 21 Nov 2013 05:20:26 GMT
Hi,

The metrics provide information for the reduce (i.e. shuffleReader) tasks
about the time taken to fetch the shuffle outputs. Is there a way I can
find out the time taken by a map task (i.e. shuffleWriter) on a remote
machine to read its input partition from disk?

I believe I should look in HadoopRDD.scala, where getRecordReader is
called. The imports show that the reader is an
org.apache.hadoop.mapred.RecordReader, but I can't find that file
anywhere.

Any help would be appreciated.

thanks!
Umar
