spark-user mailing list archives

From Raghavendra Pandey <raghavendra.pan...@gmail.com>
Subject Re: Optimizations
Date Fri, 03 Jul 2015 15:23:36 GMT
This is the basic design of Spark: it runs all actions in different
stages...
Not sure you can achieve what you are looking for.
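
As a rough illustration of the stage design mentioned above, here is a toy Python model. It is not Spark's API. It assumes the commonly described rule that a new stage begins wherever an operation needs a shuffle, and that operations without a shuffle are pipelined into the current stage. The `Op` class, `split_into_stages`, and the upstream steps `textFile` and `mapToPair` are hypothetical names made up for this sketch. The model is also simplified to a linear lineage, so "join" here stands for the post-shuffle side of the join only.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Op:
    """One step in a (hypothetical, linear) lineage."""
    name: str
    needs_shuffle: bool  # True if the step must repartition its input first


def split_into_stages(lineage: List[Op]) -> List[List[str]]:
    """Cut a linear lineage into stages.

    A new stage is opened at every step that needs a shuffle; all other
    steps are appended to the stage that is currently open.
    """
    stages: List[List[str]] = [[]]
    for op in lineage:
        # Open a new stage at a shuffle, unless the current one is still empty.
        if op.needs_shuffle and stages[-1]:
            stages.append([])
        stages[-1].append(op.name)
    return stages


# Assumed lineage: two made-up upstream steps, then the join and the
# per-partition step from the question.
lineage = [
    Op("textFile", needs_shuffle=False),
    Op("mapToPair", needs_shuffle=False),
    Op("join", needs_shuffle=True),
    Op("mapPartitionsToPair", needs_shuffle=False),
]

stages = split_into_stages(lineage)
for i, stage in enumerate(stages):
    print(f"stage {i}: {' -> '.join(stage)}")
```

Under these assumptions the per-partition step lands in the same stage as the post-shuffle side of the join, while the steps before the shuffle form their own stage. On a real job, the stage view in the Spark web UI is the place to check what actually happens.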
On Jul 3, 2015 12:43 PM, "Marius Danciu" <marius.danciu@gmail.com> wrote:

> Hi all,
>
> If I have something like:
>
> rdd.join(...).mapPartitionsToPair(...)
>
> It looks like mapPartitionsToPair runs in a different stage than join. Is
> there a way to piggyback this computation inside the join stage, such
> that each result partition after the join is passed to
> the mapPartitionsToPair function, all running in the same stage without any
> other costs?
>
> Best,
> Marius
>
