spark-user mailing list archives

From Michael Armbrust <mich...@databricks.com>
Subject Re: Trying to make sense of the actual executed code
Date Thu, 07 Aug 2014 01:32:38 GMT
This may not be exactly what you are asking for, but you might look at
queryExecution, a developer API that shows how a query is analyzed and
executed:

sql("...").queryExecution
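A slightly fuller sketch of the same idea, in case it helps. It assumes the Spark 1.1-era API (SQLContext, SchemaRDD, registerTempTable); later releases expose queryExecution on DataFrame/Dataset instead. The Record case class, the table names t1 and t2, and the sample rows are hypothetical and only make the example self-contained. Since queryExecution is a developer API, its fields may change between versions.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Hypothetical row type, used only to build two small example tables.
case class Record(key: Int, value: String)

object InspectQueryExecution {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("inspect-plan").setMaster("local[2]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    // Spark 1.x implicit: converts an RDD of case classes to a SchemaRDD.
    import sqlContext.createSchemaRDD

    // Hypothetical tables standing in for the two tables in the question.
    sc.parallelize(1 to 10).map(i => Record(i, "a" + i)).registerTempTable("t1")
    sc.parallelize(1 to 10).map(i => Record(i, "b" + i)).registerTempTable("t2")

    val query = sqlContext.sql(
      "SELECT t1.key, t1.value, t2.value FROM t2 JOIN t1 ON t2.key = t1.key")

    // Developer API: holds the plans produced at each planning phase.
    val qe = query.queryExecution

    // Printing the object shows the plans in one go.
    println(qe)

    // The physical plan that is actually run (operators such as joins and exchanges).
    println(qe.executedPlan)

    // The RDD lineage behind the physical plan.
    println(qe.toRdd.toDebugString)

    sc.stop()
  }
}
```

The physical plan is the closest thing to "which operations run in which order", and the RDD lineage from toDebugString can be compared against the stages reported in the logs.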


On Wed, Aug 6, 2014 at 3:55 PM, Tom <thubregtsen@gmail.com> wrote:

> Hi,
>
> I am looking at, for instance, the following SQL query in Spark 1.1:
> SELECT table.key, table.value, table2.value FROM table2 JOIN table WHERE
> table2.key = table.key
> When I look at the output, I see that there are several stages, and several
> tasks per stage. The tasks have a TID; I do not see such a thing for a
> stage. I see the input splits of the files and the start, running and
> finished messages for the tasks. But what I really want to know is the
> following: which map, shuffle and reduce operations are performed, and in
> which order? Where can I see the actual executed code per task/stage?
> Intermediate files/RDDs would be a bonus!
>
> Thanks in advance,
>
> Tom
