spark-user mailing list archives

From Michael Armbrust
Subject Re: How does Spark SQL traverse the physical tree?
Date Mon, 24 Nov 2014 21:15:53 GMT
You are pretty close.  The QueryExecution is what drives the phases from
parsing to execution.  Once we have a final SparkPlan (the physical plan),
toRdd just calls execute() on the root operator, and each operator's execute()
calls execute() on its children, recursing until we hit a leaf operator.  This
gives us an RDD[Row] that will compute
the answer and from there the actual execution is driven by Spark Core.
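
To make the recursion concrete, here is a minimal sketch of the pattern, not
Spark's actual code. All names (ToyPlan, ToyScan, ToyFilter, ToyProject,
ToyDemo) are hypothetical, rows are plain Seq[Any], and a lazy Iterator stands
in for RDD[Row]. The point is that there is no loop over the tree anywhere:
each operator's execute() calls execute() on its own child.

```scala
// Toy model only: these are NOT Spark's classes. Rows are plain Seq[Any],
// and a lazy Iterator stands in for RDD[Row].
sealed trait ToyPlan {
  def children: Seq[ToyPlan]
  def execute(): Iterator[Seq[Any]]
}

// Leaf operator: it has no children, so the recursion stops here.
case class ToyScan(rows: Seq[Seq[Any]]) extends ToyPlan {
  val children: Seq[ToyPlan] = Nil
  def execute(): Iterator[Seq[Any]] = rows.iterator
}

// Unary operator: calls child.execute() and wraps the result lazily.
case class ToyFilter(pred: Seq[Any] => Boolean, child: ToyPlan) extends ToyPlan {
  val children: Seq[ToyPlan] = Seq(child)
  def execute(): Iterator[Seq[Any]] = child.execute().filter(pred)
}

// Another unary operator: keeps only the columns at the given positions.
case class ToyProject(columns: Seq[Int], child: ToyPlan) extends ToyPlan {
  val children: Seq[ToyPlan] = Seq(child)
  def execute(): Iterator[Seq[Any]] =
    child.execute().map(row => columns.map(row))
}

object ToyDemo {
  def main(args: Array[String]): Unit = {
    val scan = ToyScan(Seq(Seq("a", 10), Seq("b", 30)))
    val plan = ToyProject(Seq(0),
      ToyFilter(r => r(1).asInstanceOf[Int] > 20, scan))

    // Only the root is called here; the children are reached through the
    // nested execute() calls inside each operator.
    val result = plan.execute()

    // Nothing is computed until the result is consumed, loosely mirroring
    // how the returned RDD is only evaluated later by Spark Core.
    result.foreach(println)
  }
}
```

Calling execute() on the top node is therefore enough: it only looks like the
top node is the one being run, because the descent into the tree happens
inside each operator's own execute().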

On Mon, Nov 24, 2014 at 9:52 AM, Tim Chou wrote:

> Hi All,
> I'm learning the code of Spark SQL.
> I'm confused about how SchemaRDD executes each operator.
> I'm tracing the code. I found that toRdd in QueryExecution is the
> starting point for running a query. toRdd runs the SparkPlan, which is a
> tree structure.
> However, I didn't find any iterative statement in the execute function of
> any specific operator. It seems Spark SQL will only run the top node in this
> tree.
> I know this conclusion is wrong. But which code have I missed?
> Thanks,
> Tim
