spark-user mailing list archives

From Niklas Wilcke <1wil...@informatik.uni-hamburg.de>
Subject Adaptive behavior of Spark at different network transfer rates?
Date Mon, 13 Jul 2015 14:21:04 GMT
Hello,

I'm seeing strange behavior in a larger data processing pipeline that
consists of multiple steps involving Spark core and GraphX. When I
increase the network transfer rate in the 5-node cluster from 100
Mbit/s to 1 Gbit/s, the runtime also increases, from around 15 minutes
to 19 minutes. This only holds for large input files. On small files
the faster transfer rate decreases the runtime by around one third.

I tested the network transfer rate by transmitting files from node to
node. At 100 Mbit/s I get 11.7 MByte/s and at 1 Gbit/s I get 67 MByte/s.
So the network itself should not be the cause of the slowdown.
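
As a rough sanity check on these figures, the sketch below converts each
nominal link speed to MByte/s and reports what fraction of it the
measured file transfer reached. The helper name `utilization` is made up
for illustration, and the calculation assumes decimal units (1 Mbit =
10^6 bit, 8 bits per byte) and ignores protocol overhead.

```python
def utilization(measured_mbyte_per_s, link_mbit_per_s):
    """Fraction of the nominal line rate reached by a measured transfer.

    Assumes decimal units and 8 bits per byte; protocol overhead is ignored.
    """
    nominal_mbyte_per_s = link_mbit_per_s / 8.0
    return measured_mbyte_per_s / nominal_mbyte_per_s


# (nominal link speed in Mbit/s, measured file transfer rate in MByte/s)
measurements = [(100, 11.7), (1000, 67.0)]

for link, measured in measurements:
    share = utilization(measured, link)
    print(f"{link} Mbit/s link: {measured} MByte/s is {share:.0%} of nominal")
```

Under these assumptions the 100 Mbit/s link runs at about 94% of its
nominal rate and the 1 Gbit/s link at about 54%, so the faster link is
still several times faster in absolute terms.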

My question is: do Spark and especially GraphX adapt their behavior to
the available network transfer rate? Does anybody have an idea how a
faster network could decrease performance?

Thank you very much!

Kind regards,
Niklas Wilcke


