Hi Deep,

Compute times may not be very meaningful for examples that small, since fixed overheads such as job scheduling and task launch can outweigh the actual computation.  If you increase the sizes of the examples, you may start to observe more meaningful trends and speedups.
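
As a rough illustration of why small jobs can hide any speedup, here is a back-of-the-envelope sketch in plain Python (not Spark code). It assumes a toy cost model: a fixed startup cost, a coordination cost that grows with the number of nodes, and compute work that divides evenly across nodes. The function names and all the constants are made up for illustration; they are not measurements from Spark or GCE.

```python
def estimated_job_time(work_seconds, nodes, fixed_overhead=5.0, per_node_overhead=0.5):
    """Toy estimate of wall-clock time for a job on `nodes` single-core machines.

    Assumed model (hypothetical, not measured):
      fixed_overhead     seconds of startup/scheduling cost paid once per job
      per_node_overhead  seconds of coordination cost added per node
      work_seconds       total compute time, assumed to split evenly across nodes
    """
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return fixed_overhead + per_node_overhead * nodes + work_seconds / nodes


def speedup(work_seconds, nodes, **model):
    """Ratio of the estimated 1-node time to the estimated n-node time."""
    return estimated_job_time(work_seconds, 1, **model) / estimated_job_time(
        work_seconds, nodes, **model
    )


if __name__ == "__main__":
    # Compare a tiny job with a much larger one on clusters of 1 to 7 nodes.
    for work in (1.0, 1000.0):
        print(f"total compute work = {work} s")
        for n in range(1, 8):
            t = estimated_job_time(work, n)
            print(f"  nodes={n}  time={t:.2f} s  speedup={speedup(work, n):.2f}")
```

Under these assumed constants, the tiny job gets slower as nodes are added because the overhead terms dominate, while the large job speeds up almost linearly. Real Spark jobs also pay for shuffles and data transfer, which this sketch ignores.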


On Sat, Feb 28, 2015 at 7:26 AM, Deep Pradhan <pradhandeep1991@gmail.com> wrote:
I am running Spark applications in GCE. I set up clusters with the number of nodes varying from 1 to 7. The machines are single-core machines. For each cluster, I set spark.default.parallelism to the number of nodes in that cluster. For each configuration, I ran four of the applications available in the Spark examples: SparkTC, SparkALS, SparkLR, and SparkPi.
What I notice is the following:
In the case of SparkTC and SparkALS, the time to complete the job increases as the number of nodes in the cluster increases, whereas for SparkLR and SparkPi, the time to complete the job remains the same across all configurations.
Could anyone explain this to me?

Thank you