spark-user mailing list archives

From Daniel Darabos <>
Subject Is shuffle "stable"?
Date Sat, 14 Jun 2014 19:14:43 GMT
What I mean is, let's say I run this:

import org.apache.spark.HashPartitioner
sc.parallelize(Seq(0->3, 0->2, 0->1), 3).partitionBy(new HashPartitioner(3)).collect

Will the result always be Array((0,3), (0,2), (0,1))? Or could I
possibly get a different order?

I'm pretty sure the shuffle files are taken in the order of the source
partitions... But after much searching, and the discussion on
I still can't find the code that does this.
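
If the order turns out not to be guaranteed, one possible workaround would be to sort each partition after the shuffle, so the result no longer depends on the order in which the shuffle blocks arrive. This is only a sketch: it assumes a spark-shell session where `sc` already exists, and sorting by value is just an illustrative choice of ordering.

```scala
import org.apache.spark.HashPartitioner

// Assumes a spark-shell session, where `sc` is the predefined SparkContext.
val pairs = sc.parallelize(Seq(0 -> 3, 0 -> 2, 0 -> 1), 3)

// Shuffle by key, then sort the contents of each partition by value.
// The per-partition order is then fixed by the sort, not by the order
// in which the shuffle blocks happened to be fetched.
val stable = pairs
  .partitionBy(new HashPartitioner(3))
  .mapPartitions(
    it => it.toSeq.sortBy(_._2).iterator,
    preservesPartitioning = true // keys are unchanged, so keep the partitioner
  )

val result = stable.collect()
```

This materializes each partition in memory for the sort, so it only makes sense when individual partitions are small.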

