spark-user mailing list archives

From Mark Hamstra <m...@clearstorydata.com>
Subject Re: How to time transformations and provide more detailed progress report?
Date Tue, 07 Jan 2014 18:14:30 GMT
>
> When we time an action it includes all the transformations timings too,
> and it is not clear which transformation takes how long. Is there a way of
> timing each transformation separately?


Not really. Even though you may logically specify several different
transformations within your Spark job, the transformations within a single
stage will typically get pipelined into a single combined transformation
that is applied in one pass over each partition, so separate timing
information for each logical transformation no longer makes sense and is
not available.  The best you are going to be able to get is stage- and
task-level information: stages are defined by shuffle boundaries, and a
task is the unit of work that runs one stage on one particular RDD
partition.
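
To make the pipelining point concrete, here is a rough analogy in plain
Python. It is not Spark code and does not use any Spark API: lazy `map` and
`filter` iterators stand in for two narrow transformations, `list(...)`
stands in for the action, and the names `traced`, `double` and `keep_gt_2`
are made up for illustration. The sketch only shows that the two steps run
interleaved, element by element, so the only duration you can measure
cleanly is that of the whole pass.

```python
import time


def traced(name, fn, log):
    """Wrap fn so that every call appends (name, argument) to log."""
    def wrapper(x):
        log.append((name, x))
        return fn(x)
    return wrapper


log = []
partition = [1, 2, 3]  # stand-in for the elements of one partition

# Two logical "transformations", chained lazily.
doubled = map(traced("double", lambda x: x * 2, log), partition)
kept = filter(traced("keep_gt_2", lambda x: x > 2, log), doubled)

# Nothing has been computed yet: no wrapped function has been called.
calls_before_action = len(log)

# The "action" drives both steps in a single pass over the data.
start = time.perf_counter()
result = list(kept)
elapsed = time.perf_counter() - start

print("result:", result)
print("calls before action:", calls_before_action)
print("call order:", [name for name, _ in log])
print("time for the whole pipelined pass: %.6f s" % elapsed)
```

The call order alternates between `double` and `keep_gt_2` rather than
finishing one step before starting the next, which is why there is no
separate start and end time for each logical step.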



On Tue, Jan 7, 2014 at 9:00 AM, Aureliano Buendia <buendia360@gmail.com> wrote:

> Hi,
>
> When we time an action it includes all the transformations timings too,
> and it is not clear which transformation takes how long. Is there a way of
> timing each transformation separately?
>
> Also, does spark provide a way of more detailed progress reporting, broken
> to transformation steps? For example, can the web ui progress report be
> broken into transformation steps, can we give each transformation step a
> name?
>
