spark-user mailing list archives

From Harihar Nahak <>
Subject Is Spark or GraphX faster? A performance comparison on PageRank
Date Tue, 25 Nov 2014 03:02:08 GMT
Hi all,

I have been exploring Spark for the past 2 months. I'm looking for concrete
differences between plain Spark and GraphX so that I can decide which one to
use, based on which gives the higher performance.

According to the documentation, GraphX runs 10x faster than plain Spark. So I
ran the PageRank algorithm with both:
For Spark I used:
For GraphX I used :

Input data : (1 Gb in
No of Iterations : 2 
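
For reference, here is a minimal sketch of the kind of update a two-iteration PageRank run performs. It is plain Python, not the Spark or GraphX code used for the timings above (those links are not preserved here), and it assumes the simplified scheme common in tutorial implementations: every rank starts at 1.0, the damping factor is 0.85, and pages without outgoing links contribute nothing. The function name `pagerank` and the toy edge list are hypothetical.

```python
from collections import defaultdict


def pagerank(edges, iterations=2, damping=0.85):
    """Simplified PageRank over (src, dst) pairs.

    Assumptions (not taken from the original post): ranks start at 1.0
    and dangling nodes are ignored rather than redistributed.
    """
    out_links = defaultdict(list)
    nodes = set()
    for src, dst in edges:
        out_links[src].append(dst)
        nodes.update((src, dst))

    ranks = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        # Each page splits its current rank evenly among its out-links.
        contribs = defaultdict(float)
        for src, dsts in out_links.items():
            share = ranks[src] / len(dsts)
            for dst in dsts:
                contribs[dst] += share
        # New rank = reset term + damped sum of incoming contributions.
        ranks = {
            node: (1 - damping) + damping * contribs[node]
            for node in nodes
        }
    return ranks


# Hypothetical toy graph, only to show the call shape.
toy_edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "a")]
toy_ranks = pagerank(toy_edges, iterations=2)
```

With only 2 iterations the ranks are far from converged, so a comparison like this one mostly measures graph loading, partitioning, and per-iteration shuffle cost rather than the algorithm itself.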

Time taken:

Local mode (machine: 8 cores, 16 GB memory, 2.80 GHz Intel i7; executor
memory: 4 GB; no. of partitions: 50; no. of iterations: 2) =>

Spark PageRank took -> 21.29 mins
GraphX PageRank took -> 42.01 mins

Cluster mode (Ubuntu 12.4; Spark 1.1 / Hadoop 2.4 cluster; 3 workers, 1
driver, 8 cores, 30 GB memory; executor memory: 4 GB; no. of edge partitions:
50, random vertex cut; no. of iterations: 2) =>

Spark PageRank took -> 10.54 mins
GraphX PageRank took -> 7.54 mins

Could you please help me determine when to use Spark and when to use GraphX?
If GraphX takes about the same amount of time as Spark, then it seems better
to use Spark, because Spark has a variety of operators for dealing with any
type of RDD.

Any suggestions, feedback, or pointers would be highly appreciated.


