spark-user mailing list archives

From Madhu <ma...@madhu.com>
Subject Re: Configuring Spark for reduceByKey on on massive data sets
Date Sun, 18 May 2014 00:45:10 GMT
Daniel,

How many partitions do you have?
Is the data distributed more or less uniformly across them?
We have a similar data volume currently running well on Hadoop MapReduce with
roughly 30 nodes.
I was planning to test it with Spark.
I'm very interested in your findings.



-----
Madhu
https://www.linkedin.com/in/msiddalingaiah
--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Configuring-Spark-for-reduceByKey-on-on-massive-data-sets-tp5966p5967.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
