spark-user mailing list archives

From Michael Armbrust <mich...@databricks.com>
Subject Re: How does Spark SQL traverse the physical tree?
Date Mon, 24 Nov 2014 21:15:53 GMT
You are pretty close. The QueryExecution is what drives the phases from
parsing to execution. Once we have a final SparkPlan (the physical plan),
toRdd just calls execute() on the root operator, which calls execute() on
its children, and so on recursively until we hit a leaf operator. This gives
us an RDD[Row] that will compute the answer, and from there the actual
execution is driven by Spark Core.
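
To make the recursion concrete, here is a minimal toy sketch in plain Scala.
It is not Spark's code: PlanNode, Scan, Filter, Project and run are
hypothetical names made up for illustration, and a lazy Iterator stands in
for RDD[Row]. The point is only that no operator loops over the tree; each
one calls execute() on its own child and wraps the result.

```scala
// Toy model only: these names are hypothetical and are NOT Spark's classes.
object RecursiveExecuteSketch {
  type Row = Seq[Any]

  // Stand-in for a physical operator: it knows its children and how to
  // produce its own output (an Iterator here, standing in for RDD[Row]).
  sealed trait PlanNode {
    def children: Seq[PlanNode]
    def execute(): Iterator[Row]
  }

  // Leaf operator: no children, so the recursion bottoms out here.
  final case class Scan(rows: Seq[Row]) extends PlanNode {
    val children: Seq[PlanNode] = Nil
    def execute(): Iterator[Row] = rows.iterator
  }

  // Unary operator: no tree traversal loop, just a call to child.execute()
  // whose result is wrapped with this operator's own logic.
  final case class Filter(pred: Row => Boolean, child: PlanNode) extends PlanNode {
    val children: Seq[PlanNode] = Seq(child)
    def execute(): Iterator[Row] = child.execute().filter(pred)
  }

  // Another unary operator: keeps only the given column positions.
  final case class Project(columns: Seq[Int], child: PlanNode) extends PlanNode {
    val children: Seq[PlanNode] = Seq(child)
    def execute(): Iterator[Row] = child.execute().map(row => columns.map(i => row(i)))
  }

  // Builds a three-node plan and executes only the root; the children run
  // because each execute() calls the execute() below it.
  def run(): List[Row] = {
    val scan = Scan(Seq(Seq("alice", 34), Seq("bob", 27), Seq("carol", 41)))
    val plan = Project(Seq(0), Filter(row => row(1).asInstanceOf[Int] > 30, scan))
    plan.execute().toList
  }
}
```

Calling RecursiveExecuteSketch.run() touches only the top node directly, yet
the Filter and Scan below it still run. Nothing is computed until toList
consumes the iterator, which loosely mirrors how the RDD returned by
execute() is only evaluated when Spark Core runs it.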

On Mon, Nov 24, 2014 at 9:52 AM, Tim Chou <timchou.hit@gmail.com> wrote:

> Hi All,
>
> I'm learning the code of Spark SQL.
>
> I'm confused about how SchemaRDD executes each operator.
>
> I'm tracing the code. I found that the toRdd function in QueryExecution
> is the starting point for running a query. toRdd runs the SparkPlan,
> which is a tree structure.
>
> However, I didn't find any loop or iteration in the execute function of
> any concrete operator. It seems Spark SQL will only run the top node in
> this tree.
>
> I know this conclusion is wrong. But which code have I missed?
>
> Thanks,
> Tim
>
