spark-user mailing list archives

From Ameet Kini <ameetk...@gmail.com>
Subject Re: sort order after reduceByKey / groupByKey
Date Thu, 20 Mar 2014 19:32:01 GMT
I saw that, but I don't need a global sort, only an intra-partition sort.
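
To make the intra-partition sort concrete, here is a minimal sketch, not a definitive implementation. It assumes string keys with integer values, a HashPartitioner with two partitions, a sum as the reduce function, and that each partition fits in memory when it is materialized for sorting. The object name IntraPartitionSort, the method run, and the sample data are hypothetical and only serve the illustration.

```scala
import org.apache.spark.{HashPartitioner, SparkConf, SparkContext}
// Brings in the pair-RDD implicits on older Spark releases; newer releases
// resolve them without this import.
import org.apache.spark.SparkContext._

object IntraPartitionSort {

  // Partitions, reduces by key, then sorts each partition by key on its own.
  // Returns the content of every partition as an array, for inspection only.
  def run(sc: SparkContext): Array[Array[(String, Int)]] = {
    // Hypothetical sample data.
    val rdd = sc.parallelize(
      Seq(("d", 1), ("a", 2), ("c", 3), ("a", 4), ("b", 5), ("d", 6)))

    // Assumed partitioner and reduce function; substitute your own.
    val reduced = rdd.partitionBy(new HashPartitioner(2)).reduceByKey(_ + _)

    // Sort within each partition only; there is no ordering across partitions.
    // The whole partition is materialized in memory before sorting.
    val sortedWithin = reduced.mapPartitions(
      iter => iter.toArray.sortBy(_._1).iterator,
      preservesPartitioning = true)

    sortedWithin.glom().collect()
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("intra-partition-sort").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      run(sc).zipWithIndex.foreach { case (part, i) =>
        println(s"partition $i: ${part.mkString(", ")}")
      }
    } finally {
      sc.stop()
    }
  }
}
```

Passing preservesPartitioning = true keeps the partitioner attached to the result, since the sort does not change any keys. Unlike sortByKey, this does not shuffle the data again and gives no ordering between partitions.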

Ameet


On Thu, Mar 20, 2014 at 3:26 PM, Mayur Rustagi <mayur.rustagi@gmail.com> wrote:

> That's expected. I think sortByKey is an option too, and probably a better one.
>
> Mayur Rustagi
> Ph: +1 (760) 203 3257
> http://www.sigmoidanalytics.com
>  @mayur_rustagi <https://twitter.com/mayur_rustagi>
>
>
>
> On Thu, Mar 20, 2014 at 3:20 PM, Ameet Kini <ameetkini@gmail.com> wrote:
>
>>
>> val rdd2 = rdd.partitionBy(myPartitioner).reduceByKey(someFunction)
>>
>> I see that rdd2's partitions are not internally sorted. Can someone
>> confirm that this is expected behavior? And if so, is the only way to get
>> the partitions internally sorted to follow it with something like this?
>>
>> val rdd2 = rdd.partitionBy(myPartitioner).reduceByKey(someFunction)
>>   .mapPartitions(p => sort(p))
>>
>> Thanks,
>> Ameet
>>
>>
>
