spark-user mailing list archives

From Mike Frampton
Subject [General Question] [Hadoop + Spark at scale] Spark Rack Awareness ?
Date Sun, 19 Jul 2015 01:25:31 GMT
I wanted to ask a general question about Hadoop/Yarn and Apache Spark integration. I know that Hadoop on a physical cluster has rack awareness, i.e. it attempts to minimise network traffic by saving replicated blocks within a rack.

I wondered whether, when Spark is configured to use Yarn as a cluster manager, it is able to use this feature to also minimise network traffic to a degree.

Sorry if this question is not quite accurate, but I think you can generally see what I mean.