spark-user mailing list archives

From Ameet Kini <ameetk...@gmail.com>
Subject sort order after reduceByKey / groupByKey
Date Thu, 20 Mar 2014 19:20:22 GMT
val rdd2 = rdd.partitionBy(myPartitioner).reduceByKey(someFunction)

I see that rdd2's partitions are not internally sorted. Can someone confirm
that this is expected behavior? If so, is the only way to get the partitions
internally sorted to follow it with something like this:

val rdd2 = rdd.partitionBy(myPartitioner).reduceByKey(someFunction).mapPartitions(p => sort(p))
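
For concreteness, here is a minimal sketch of what that per-partition sort might look like. It is an illustration under assumptions, not a confirmed answer: the object name SortWithinPartitions, the helper reduceAndSort, the sample data, the HashPartitioner with 2 partitions, and the local[2] master are all made up for the example. It passes the partitioner straight to the reduceByKey(partitioner, func) overload instead of calling partitionBy first, and it buffers each partition in memory to sort it, so it only suits partitions that fit in memory.

```scala
import org.apache.spark.{HashPartitioner, SparkConf, SparkContext}
// Needed on older Spark releases to bring the pair-RDD implicits into scope.
import org.apache.spark.SparkContext._

object SortWithinPartitions {

  // Hypothetical helper: reduces by key, then sorts each partition by key.
  // Returns one array per partition so the per-partition order can be inspected.
  def reduceAndSort(sc: SparkContext): Array[Array[(String, Int)]] = {
    // Made-up sample data for illustration only.
    val rdd = sc.parallelize(Seq(("b", 1), ("a", 2), ("c", 3), ("a", 4), ("b", 5)))
    val partitioner = new HashPartitioner(2)

    val rdd2 = rdd
      // Shuffle with the given partitioner and sum the values per key.
      .reduceByKey(partitioner, (x: Int, y: Int) => x + y)
      // Materialize each partition and sort it by key. The keys do not move
      // between partitions, so the partitioner can be kept.
      .mapPartitions(
        iter => iter.toArray.sortBy(_._1).iterator,
        preservesPartitioning = true)

    rdd2.glom().collect()
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("sort-within-partitions").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Print one line per partition.
      reduceAndSort(sc).foreach(p => println(p.mkString(", ")))
    } finally {
      sc.stop()
    }
  }
}
```

Setting preservesPartitioning = true is a design choice here: since the sort only reorders records inside a partition, later key-based operations can still reuse the same partitioner.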

Thanks,
Ameet
