spark-issues mailing list archives

From "Aaron Davidson (JIRA)" <>
Subject [jira] [Commented] (SPARK-4740) Netty's network throughput is about 1/2 of NIO's in spark-perf sortByKey
Date Fri, 05 Dec 2014 03:42:12 GMT


Aaron Davidson commented on SPARK-4740:

To clarify, we have two hypotheses currently:

1. Something about transferTo actually makes it less efficient in this situation than reading
the whole block into memory.
2. The fact that we have only 1 connection (and thus one serving thread) per peer causes us
to concurrently access at most 3 disks at once, though we're not sure whether NIO uses more
than 3 serving threads either.
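The two code paths in hypothesis 1 can be sketched as a minimal, self-contained comparison. This is not Spark's actual shuffle code; it only illustrates the difference between zero-copy `FileChannel.transferTo` and copying through a user-space buffer, with a second file channel standing in for the Netty socket channel:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class TransferToDemo {
    public static void main(String[] args) throws IOException {
        Path src  = Files.createTempFile("shuffle-block", ".data");
        Path dst1 = Files.createTempFile("out-transferTo", ".data");
        Path dst2 = Files.createTempFile("out-buffered", ".data");
        byte[] data = new byte[1 << 20];          // 1 MiB of dummy shuffle data
        Files.write(src, data);

        // Path 1: zero-copy transfer straight from the page cache to the
        // destination channel, no trip through a user-space buffer.
        try (FileChannel in  = FileChannel.open(src,  StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst1, StandardOpenOption.WRITE)) {
            long pos = 0, count = in.size();
            while (pos < count) {
                pos += in.transferTo(pos, count - pos, out);
            }
        }

        // Path 2: read into a heap buffer and write it back out, the
        // alternative that hypothesis 1 suspects may be faster here.
        try (FileChannel in  = FileChannel.open(src,  StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst2, StandardOpenOption.WRITE)) {
            ByteBuffer buf = ByteBuffer.allocate(64 * 1024);
            while (in.read(buf) != -1) {
                buf.flip();
                out.write(buf);
                buf.clear();
            }
        }

        System.out.println(Files.size(dst1) == data.length && Files.size(dst2) == data.length
                ? "both paths copied " + data.length + " bytes"
                : "size mismatch");
    }
}
```

Ruling out hypothesis 1 would mean timing both paths on the actual shuffle files and hardware, since `transferTo` behavior depends on the kernel and filesystem.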

If we rule out transferTo and it turns out NIO uses more than 3 serving threads, we should
likely make TransportClientFactory able to produce more than one TransportClient per host for
situations where the number of Executors is much smaller than the number of cores per Executor.
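A multi-client-per-peer factory could look roughly like the sketch below. The `TransportClient` record, the `NUM_CONNECTIONS_PER_PEER` constant, and the round-robin selection are all hypothetical stand-ins, not Spark's actual classes; the real factory would open a Netty channel where this one just constructs a record:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

public class PooledClientFactory {
    static final int NUM_CONNECTIONS_PER_PEER = 4;   // assumed tunable

    // Stand-in for Spark's TransportClient; a real one wraps a Netty channel.
    record TransportClient(String host, int id) {}

    private final Map<String, List<TransportClient>> pool = new ConcurrentHashMap<>();
    private final AtomicInteger counter = new AtomicInteger();

    TransportClient createClient(String host) {
        List<TransportClient> clients =
            pool.computeIfAbsent(host, h -> new CopyOnWriteArrayList<>());
        // Lazily grow the pool up to the per-peer cap, then round-robin,
        // so several serving threads (and thus disks) can be busy at once.
        if (clients.size() < NUM_CONNECTIONS_PER_PEER) {
            synchronized (clients) {
                if (clients.size() < NUM_CONNECTIONS_PER_PEER) {
                    clients.add(new TransportClient(host, clients.size()));
                }
            }
        }
        return clients.get(counter.getAndIncrement() % clients.size());
    }

    public static void main(String[] args) {
        PooledClientFactory f = new PooledClientFactory();
        for (int i = 0; i < 8; i++) {
            System.out.println(f.createClient("executor-1").id());
        }
    }
}
```

With a cap of 4, eight requests to the same peer cycle through connections 0, 1, 2, 3 twice, instead of funneling everything through a single connection as today.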

> Netty's network throughput is about 1/2 of NIO's in spark-perf sortByKey
> ------------------------------------------------------------------------
>                 Key: SPARK-4740
>                 URL:
>             Project: Spark
>          Issue Type: Improvement
>          Components: Shuffle, Spark Core
>    Affects Versions: 1.2.0
>            Reporter: Zhang, Liye
>         Attachments: Spark-perf Test Report.pdf, TestRunner sort-by-key - Thread dump for executor 1_files (48 Cores per node).zip
> When testing the current spark master (1.3.0-snapshot) with spark-perf (sort-by-key, aggregate-by-key,
etc.), the Netty-based shuffle transferService takes much longer than the NIO-based shuffle transferService.
The network throughput of Netty is only about half that of NIO.
> We tested in standalone mode; the data set used for the test is 20 billion records, totaling
about 400GB. The spark-perf test runs on a 4-node cluster with 10G NICs, 48 CPU cores per node,
and 64GB of memory per executor. The number of reduce tasks is set to 1000.

This message was sent by Atlassian JIRA

