spark-user mailing list archives

From Mayur Rustagi <mayur.rust...@gmail.com>
Subject Re: flatMap followed by mapPartitions
Date Thu, 13 Nov 2014 04:35:12 GMT
flatMap would have to shuffle data only if the output RDD is expected to be
partitioned by some key.
RDD[X].flatMap(X => TraversableOnce[Y])   // yields RDD[Y]
If it does have to shuffle, the shuffle should be local.
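
As a rough illustration, here is a toy Python model (plain lists, not the
Spark API) of flatMap followed by mapPartitions. The names `partitions`,
`flat_map` and `map_partitions` are made up for this sketch. It assumes no
key-based partitioning is requested, in which case each output partition is
built only from the matching input partition and no records move between
partitions.

```python
from itertools import chain

# Toy model (assumption): an "RDD" is a list of partitions,
# and each partition is a list of records.
partitions = [[1, 2], [3], [4, 5, 6]]


def flat_map(parts, f):
    # Apply f to every record and flatten the results,
    # one partition at a time; records never change partition.
    return [list(chain.from_iterable(f(x) for x in part)) for part in parts]


def map_partitions(parts, g):
    # Hand each partition to g as a single iterator (one "block").
    return [list(g(iter(part))) for part in parts]


# Each record x expands into two records, still in the same partition.
expanded = flat_map(partitions, lambda x: [x, x * 10])

# A blocked operation: one sum per partition.
block_sums = map_partitions(expanded, lambda it: [sum(it)])

print(expanded)
print(block_sums)
```

In this model the partition count and partition boundaries are unchanged
after both steps, which is what a transformation without a shuffle looks
like.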

Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi <https://twitter.com/mayur_rustagi>


On Thu, Nov 13, 2014 at 7:31 AM, Debasish Das <debasish.das83@gmail.com>
wrote:

> Hi,
>
> I am doing a flatMap followed by mapPartitions to do some blocked
> operation...flatMap is shuffling data but this shuffle is strictly
> shuffling to disk and not over the network right ?
>
> Thanks.
> Deb
>
