Performance Portal for Apache Spark

Description


Each data point represents a workload's runtime change, in percent, relative to the previous week. Each line represents a different workload running in Spark yarn-client mode.
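The week-over-week change described above amounts to a simple relative difference of elapsed times. A minimal sketch (a hypothetical helper; the portal's actual scripts are not shown here):

```python
def wow_change_percent(current_runtime, previous_runtime):
    """Week-over-week runtime change in percent.

    Negative values mean the workload ran faster than last week.
    """
    return (current_runtime - previous_runtime) / previous_runtime * 100.0

# Example: a workload that took 50 s last week and 47 s this week
print(round(wow_change_percent(47.0, 50.0), 1))  # -6.0
```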

Hardware


CPU type: Intel® Xeon® CPU E5-2697 v2 @ 2.70GHz 
Memory: 128GB
NIC: 10GbE
Disk(s): 8 x 1TB SATA HDD

Software


Java version: 1.8.0_25
Hadoop version: 2.5.0-CDH5.3.2
HiBench version: 4.0
Spark running in yarn-client mode
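For reference, yarn-client mode in Spark 1.x is selected at submission time, for example:

```shell
# Submit an application in yarn-client mode (Spark 1.x syntax):
# the driver runs on the submitting host; executors run in YARN containers.
# The class name, jar path, and resource sizes below are placeholders,
# not values taken from this report.
spark-submit \
  --master yarn-client \
  --num-executors 10 \
  --executor-memory 8g \
  --class org.example.MyApp \
  /path/to/my-app.jar
```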

Cluster


1 master node
10 slave nodes

Regular

Summary

A lower percentage indicates better performance.


Group        ww22     ww23     ww24     ww25     ww26     ww27     ww28     ww29
HiBench      6.0%     7.9%     -6.5%    -3.1%    -2.1%    -6.4%    -2.7%    -0.7%
spark-perf   -1.8%    4.1%     -4.7%    -4.6%    -5.4%    -4.6%    -12.8%   -12.5%

http://01org.github.io/sparkscore/image/plaf1.time/overall.png
Y-axis: normalized completion time; X-axis: work week.
The commit number for each week can be found in the result tables.
The performance score for each workload is normalized against the elapsed time of the Spark 1.2 release; lower is better.
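The normalization described above divides each run's elapsed time by the Spark 1.2 baseline time for the same workload. A sketch of the presumed calculation (hypothetical helper names; the portal's own scripts are not shown):

```python
def normalized_score(elapsed_time, baseline_1_2_time):
    """Completion time normalized to the Spark 1.2 release baseline.

    A value below 1.0 means the run was faster than the 1.2 baseline.
    """
    return elapsed_time / baseline_1_2_time

# Example: 84 s on the current build vs. a 100 s Spark 1.2 baseline
print(normalized_score(84.0, 100.0))  # 0.84
```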

Detail


HiBench


JOB          ww22      ww23      ww24      ww25      ww26      ww27      ww28      ww29
commit       530efe3e  90c60692  db81b9d8  4eb48ed1  32e3cdaa  ec784381  2b820f2a  c472eb17
sleep        -2.1%     -2.9%     -4.1%     12.8%     -5.1%     -4.5%     -3.1%     -0.7%
wordcount    8.0%      8.3%      -18.6%    -10.9%    6.9%      -12.9%    -10.0%    -9.2%
kmeans       72.1%     92.9%     86.9%     95.8%     123.3%    99.3%     127.9%    102.6%
scan         null      -1.1%     -25.5%    -21.0%    -12.4%    -19.8%    -19.7%    -20.5%
bayes        -18.3%    -11.1%    -29.7%    -31.3%    -30.9%    -31.1%    -31.0%    -30.1%
aggregation  null      9.2%      -15.3%    -15.0%    -37.6%    -37.0%    -37.3%    7.6%
join         null      1.0%      -12.7%    -13.9%    -16.4%    -17.8%    -14.8%    -13.2%
sort         -11.9%    -12.5%    -17.5%    -17.3%    -20.7%    -17.7%    -13.9%    -15.6%
pagerank     4.0%      2.9%      -11.4%    -13.0%    -11.4%    -10.1%    -12.0%    -11.7%
terasort     -9.5%     -7.3%     -16.7%    -17.0%    -16.3%    -11.9%    -13.1%    -15.7%

Note: null means the workload did not run, or failed, in that week.

http://01org.github.io/sparkscore/image/plaf1.time/HiBench_workloads.png
Y-axis: normalized completion time; X-axis: work week.
The commit number for each week can be found in the result tables.
The performance score for each workload is normalized against the elapsed time of the Spark 1.2 release; lower is better.

spark-perf


JOB           ww22      ww23      ww24      ww25      ww26      ww27      ww28      ww29
commit        530efe3e  90c60692  db81b9d8  4eb48ed1  32e3cdaa  ec784381  2b820f2a  c472eb17
agg           null      18.3%     5.2%      2.5%      1.1%      3.0%      -18.8%    -19.0%
agg-int       null      9.6%      4.0%      8.2%      7.0%      7.5%      6.2%      11.4%
agg-naive     null      -0.8%     -6.7%     -6.8%     -8.5%     -6.9%     -15.5%    -18.0%
scheduling    -14.5%    -2.1%     -6.4%     -6.5%     -5.7%     -1.8%     -6.0%     -9.7%
count-filter  6.6%      6.8%      -10.2%    -10.4%    -9.8%     -10.4%    -18.0%    -17.4%
count         6.7%      8.0%      -7.3%     -7.0%     -8.0%     -7.4%     -15.1%    -14.3%
sort          -6.2%     -7.0%     -14.6%    -14.4%    -13.9%    -15.9%    -24.0%    -23.2%
sort-int      -1.6%     -0.1%     -1.5%     -2.2%     -5.3%     -5.0%     -11.3%    -9.6%

Note: null means the workload did not run, or failed, in that week.

http://01org.github.io/sparkscore/image/plaf1.time/spark-perf_workloads.png
Y-axis: normalized completion time; X-axis: work week.
The commit number for each week can be found in the result tables.
The performance score for each workload is normalized against the elapsed time of the Spark 1.2 release; lower is better.