spark-user mailing list archives

From Mayur Rustagi <mayur.rust...@gmail.com>
Subject Re: sort order after reduceByKey / groupByKey
Date Thu, 20 Mar 2014 19:26:35 GMT
That's expected. I think sortByKey is an option too, and probably a better one.
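
For anyone reading this thread later, here is a minimal sketch of the two options discussed, not a definitive implementation. The sample data, the choice of HashPartitioner, the sum as the reduce function, and the local master are all assumptions made for illustration; none of them come from the original question. It relies on reduceByKey accepting a partitioner directly, and on sortByKey producing a globally sorted RDD at the cost of another shuffle.

```scala
import org.apache.spark.{HashPartitioner, SparkConf, SparkContext}
import org.apache.spark.SparkContext._ // pair-RDD implicits, needed on older Spark releases

object SortWithinPartitions {
  def main(args: Array[String]): Unit = {
    // Local master and app name are placeholders for illustration only.
    val conf = new SparkConf().setAppName("sort-within-partitions").setMaster("local[2]")
    val sc = new SparkContext(conf)

    // Hypothetical sample data standing in for the original rdd.
    val rdd = sc.parallelize(Seq(("b", 1), ("a", 2), ("c", 3), ("a", 4), ("b", 5)))
    val partitioner = new HashPartitioner(2) // stand-in for "my partitioner"

    // reduceByKey takes the partitioner itself, so a separate partitionBy is optional.
    val reduced = rdd.reduceByKey(partitioner, (a: Int, b: Int) => a + b)

    // Option 1: sort each partition by key after the reduce.
    // Note that this loads one whole partition into memory at a time.
    val sortedWithin = reduced.mapPartitions(
      iter => iter.toArray.sortBy(_._1).iterator,
      preservesPartitioning = true)

    // Option 2: sortByKey, which range-partitions and sorts by key (extra shuffle).
    val globallySorted = reduced.sortByKey()

    // Print each partition on its own line to inspect the ordering.
    sortedWithin.glom().collect().foreach(p => println(p.mkString(", ")))
    globallySorted.glom().collect().foreach(p => println(p.mkString(", ")))

    sc.stop()
  }
}
```

Option 1 keeps the custom partitioning and only orders records inside each partition. Option 2 replaces the partitioning with a range partitioning. Later Spark releases also offer repartitionAndSortWithinPartitions, which may be worth checking if it is available in your version.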

Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi <https://twitter.com/mayur_rustagi>



On Thu, Mar 20, 2014 at 3:20 PM, Ameet Kini <ameetkini@gmail.com> wrote:

>
> val rdd2 = rdd.partitionBy(my partitioner).reduceByKey(some function)
>
> I see that rdd2's partitions are not internally sorted. Can someone
> confirm that this is expected behavior? And if so, is the only way to get
> partitions internally sorted to follow it with something like this?
>
> val rdd2 = rdd.partitionBy(my partitioner).reduceByKey(some function)
>   .mapPartitions(p => sort(p))
>
> Thanks,
> Ameet
>
>
