spark-issues mailing list archives

From "Yuval Degani (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SPARK-22229) SPIP: RDMA Accelerated Shuffle Engine
Date Mon, 09 Oct 2017 19:31:00 GMT
Yuval Degani created SPARK-22229:
------------------------------------

             Summary: SPIP: RDMA Accelerated Shuffle Engine
                 Key: SPARK-22229
                 URL: https://issues.apache.org/jira/browse/SPARK-22229
             Project: Spark
          Issue Type: Improvement
          Components: Spark Core
    Affects Versions: 2.3.0
            Reporter: Yuval Degani


An RDMA-accelerated shuffle engine can provide enormous performance benefits to shuffle-intensive
Spark jobs, as demonstrated by the open-source “SparkRDMA” plugin ([https://github.com/Mellanox/SparkRDMA]).
Using RDMA for shuffle significantly improves CPU utilization and reduces I/O processing overhead
by bypassing the kernel and networking stack and by avoiding memory copies entirely. The CPU
cycles saved are then available to the actual Spark workloads, which helps reduce
job runtime significantly.
This performance gain has been demonstrated with both the industry-standard HiBench TeraSort benchmark
(a 1.5x speedup in sorting) and shuffle-intensive customer applications.
SparkRDMA will be presented at Spark Summit Europe 2017 in Dublin ([https://spark-summit.org/eu-2017/events/accelerating-shuffle-a-tailor-made-rdma-solution-for-apache-spark/]).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

