spark-user mailing list archives

From Adrian Mocanu <>
Subject partitioning via groupByKey
Date Wed, 19 Mar 2014 16:32:43 GMT
When you partition via groupByKey, tuples (parts of the RDD) are moved from one node to another
based on their key (hash partitioning).
Do the tuples remain part of a single RDD as before, just moved to different nodes, or does this
shuffling create, say, several RDDs, each holding part of the original RDD?
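
To make the question concrete, here is a rough sketch of what I mean by hash partitioning. It is plain Python, not Spark code: hash_partition is a made-up helper, the two partitions stand in for two nodes, and I am assuming the target partition is simply hash(key) modulo the number of partitions. It only illustrates how tuples with the same key end up together; it says nothing about how Spark represents the result.

```python
from collections import defaultdict


def hash_partition(pairs, num_partitions):
    """Group (key, value) pairs by key, placing each key in the
    partition chosen by hash(key) % num_partitions.

    Hypothetical helper for illustration only; not part of Spark.
    """
    partitions = [defaultdict(list) for _ in range(num_partitions)]
    for key, value in pairs:
        index = hash(key) % num_partitions  # assumed partitioning rule
        partitions[index][key].append(value)
    # Convert to plain dicts so the result is easy to inspect.
    return [dict(p) for p in partitions]


# Integer keys keep the example deterministic (hash(n) == n for small ints).
pairs = [(1, "a"), (2, "b"), (1, "c"), (3, "d"), (2, "e")]
result = hash_partition(pairs, 2)

for index, partition in enumerate(result):
    print(index, partition)
```

All values for a given key land in the same partition, so each key can be grouped without looking at any other partition.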

