spark-user mailing list archives

From Patrick
Subject groupByKey vs mapPartitions for efficient grouping within a Partition
Date Mon, 16 Jan 2017 14:21:39 GMT

Does groupByKey have any intelligence built in, such that if all the
keys already reside in the same partition, it does not do the shuffle?

Or should the user write mapPartitions with Scala groupBy code inside it?
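
To make the second option concrete, here is a minimal sketch of what I mean by grouping inside mapPartitions. The names PartitionGrouping, groupWithinPartition and pairs are hypothetical, made up for illustration. It assumes every record for a given key is already in one partition (otherwise the groups are only partial), and it holds the whole partition in memory while grouping.

```scala
import scala.collection.mutable

object PartitionGrouping {
  // Hypothetical helper: groups the records of ONE partition by key.
  // Keys come out in first-seen order; values keep their input order.
  // The entire partition is buffered in memory before anything is emitted.
  def groupWithinPartition[K, V](records: Iterator[(K, V)]): Iterator[(K, List[V])] = {
    val groups = mutable.LinkedHashMap.empty[K, mutable.ArrayBuffer[V]]
    records.foreach { case (k, v) =>
      groups.getOrElseUpdate(k, mutable.ArrayBuffer.empty[V]) += v
    }
    groups.iterator.map { case (k, vs) => (k, vs.toList) }
  }
}

// Intended use on a pair RDD (assumes an existing `pairs: RDD[(String, Int)]`):
//   val grouped = pairs.mapPartitions(
//     it => PartitionGrouping.groupWithinPartition(it),
//     preservesPartitioning = true)
```

The helper is plain Scala, so it can be tried on an ordinary Iterator without a cluster, for example PartitionGrouping.groupWithinPartition(Iterator("a" -> 1, "b" -> 2, "a" -> 3)).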

Which approach would be more efficient, and what are the memory
considerations for each?

