spark-user mailing list archives

From Sandy Ryza <sandy.r...@cloudera.com>
Subject Re: [General Question] [Hadoop + Spark at scale] Spark Rack Awareness ?
Date Sun, 19 Jul 2015 17:21:38 GMT
Hi Mike,

Spark is rack-aware in its task scheduling. Currently Spark doesn't honor
any locality preferences when scheduling executors, but this is being
addressed in SPARK-4352, after which executor scheduling will be rack-aware
as well.
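
For reference, the task-level rack awareness Sandy describes can be tuned
through Spark's locality-wait settings: the scheduler waits up to the
configured time for a process-local, node-local, or rack-local slot before
falling back to the next locality level. A sketch of the relevant entries in
spark-defaults.conf (the 3s values are illustrative defaults, not a
recommendation):

```
# How long to wait before giving up on a preferred locality level
# and launching the task at a less-local level.
spark.locality.wait          3s
# Optional per-level overrides (fall back to spark.locality.wait if unset):
spark.locality.wait.process  3s
spark.locality.wait.node     3s
spark.locality.wait.rack     3s
```

Setting these to 0 disables the wait entirely, trading data locality for
lower scheduling latency.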

-Sandy

On Sat, Jul 18, 2015 at 6:25 PM, Mike Frampton <mike_frampton@hotmail.com>
wrote:

> I wanted to ask a general question about Hadoop/Yarn and Apache Spark
> integration. I know that Hadoop on a physical cluster has rack
> awareness, i.e. it attempts to minimise network traffic by saving
> replicated blocks within a rack.
>
> I wondered whether, when Spark is configured to use Yarn as a cluster
> manager, it is able to use this feature to also minimise network
> traffic to a degree.
>
> Sorry if this question is not quite accurate, but I think you can
> generally see what I mean?
>
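
The HDFS rack awareness Mike refers to is driven by a site-specific topology
script that the NameNode invokes (configured via
net.topology.script.file.name); it receives one or more host names or IPs and
must print one rack path per input. A minimal sketch, written as a shell
function with hypothetical subnets and rack names:

```shell
#!/bin/sh
# resolve_rack: hypothetical topology mapping for Hadoop rack awareness.
# Hadoop passes one or more hosts/IPs; we must echo one rack path each.
resolve_rack() {
  for host in "$@"; do
    case "$host" in
      10.1.1.*) echo "/dc1/rack1" ;;   # assumed subnet for rack 1
      10.1.2.*) echo "/dc1/rack2" ;;   # assumed subnet for rack 2
      *)        echo "/default-rack" ;; # unknown hosts fall back here
    esac
  done
}
```

In a real deployment this would live in an executable file referenced from
core-site.xml, often backed by a lookup table rather than subnet globs.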
