spark-user mailing list archives

From "deenar.toraskar" <deenar.toras...@db.com>
Subject Equally weighted partitions in Spark
Date Thu, 01 May 2014 15:30:37 GMT
Hi

I am using Spark to distribute computationally intensive tasks across the
cluster. Currently I partition my RDD of tasks randomly. There is a large
variation in how long each job takes to complete, so most partitions finish
quickly while a couple take forever. I can mitigate this problem to some
extent by increasing the number of partitions.

Ideally I would like to partition tasks by complexity (let's assume I can
get such a value from the task object) so that the total complexity of the
elements in each partition is roughly equal. Has anyone created such a
partitioner before?
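For reference, the balancing step being asked about can be sketched with a greedy longest-processing-time heuristic: assign each task, heaviest first, to the currently lightest partition. The sketch below is plain Python and independent of Spark; the per-task `complexity` values and the `balance_partitions` helper are assumptions for illustration. A custom Spark `Partitioner` would wrap the resulting task-to-partition mapping in its `getPartition` method.

```python
def balance_partitions(complexities, num_partitions):
    """Greedy LPT sketch: place each task (heaviest first) into the
    partition with the smallest total complexity so far.
    Returns a list mapping task index -> partition id."""
    # Visit task indices from heaviest to lightest.
    order = sorted(range(len(complexities)),
                   key=lambda i: complexities[i], reverse=True)
    loads = [0] * num_partitions          # running complexity per partition
    assignment = [0] * len(complexities)  # task index -> partition id
    for i in order:
        p = loads.index(min(loads))       # lightest partition so far
        assignment[i] = p
        loads[p] += complexities[i]
    return assignment

# Example: six tasks with skewed complexities, split over two partitions.
tasks = [10, 9, 8, 1, 1, 1]
assign = balance_partitions(tasks, 2)
```

With the skewed input above, the partition totals come out close to each other (13 vs 17) rather than leaving one partition with nearly all the work, which is the behaviour a complexity-aware partitioner is after.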


Regards
Deenar



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Equally-weighted-partitions-in-Spark-tp5171.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
