hadoop-yarn-dev mailing list archives

From Arun Suresh <asur...@apache.org>
Subject Re: Launching containers asynchronously
Date Fri, 13 Oct 2017 21:14:13 GMT
Hello Craig

Thanks for trying this. Asynchronous scheduling (in the 2.x branches of
YARN released so far) is fairly experimental, and it does lead to some
unnecessary locking and race conditions.
Wangda has refactored most of the asynchronous scheduling code paths;
that work should be available in 2.9.0, and you can give it a shot in
3.0.0-beta1 as well.
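
If you want to experiment with the refactored path, the knobs live in
capacity-scheduler.xml. A minimal sketch - the maximum-threads and
scheduling-interval-ms properties are my recollection of the refactored
code, so please verify them against your build:

  yarn.scheduler.capacity.schedule-asynchronously.enable=true
  # number of asynchronous scheduling threads (post-refactor knob)
  yarn.scheduler.capacity.schedule-asynchronously.maximum-threads=4
  # pause between scheduling passes, in milliseconds
  yarn.scheduler.capacity.schedule-asynchronously.scheduling-interval-ms=5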

The default scheduling mode (what you refer to as synchronous scheduling)
is actually node heartbeat-triggered scheduling. There are certain cases
where I suspect the default scheduling might still be more apt - for
example, when most of your requests have strict data-locality
requirements. Also, in a cluster running close to full utilization, I
suspect asynchronous scheduling might show higher latencies - though I
have yet to test this.

But in general, it is a direction we are actively looking at. BTW, for
extremely short-duration tasks, there is also the option of using
OPPORTUNISTIC containers (https://issues.apache.org/jira/browse/YARN-2877
and https://issues.apache.org/jira/browse/YARN-5220), but the AM needs
explicit support for those.
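
To give a flavor of the AM-side support needed, here is a rough sketch
using the AMRMClient API. Treat the exact constructor overload and the
RM property below as my best recollection rather than gospel:

import org.apache.hadoop.yarn.api.records.ExecutionType;
import org.apache.hadoop.yarn.api.records.ExecutionTypeRequest;
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.async.AMRMClientAsync;

public class OpportunisticRequestSketch {
  // Ask the RM for one OPPORTUNISTIC container; these are queued at the
  // NodeManager and run when capacity frees up, which suits very
  // short-lived tasks.
  static void requestOpportunistic(
      AMRMClientAsync<AMRMClient.ContainerRequest> client) {
    Resource capability = Resource.newInstance(1024, 1); // 1 GB, 1 vcore
    ExecutionTypeRequest execType =
        ExecutionTypeRequest.newInstance(ExecutionType.OPPORTUNISTIC, true);
    AMRMClient.ContainerRequest request = new AMRMClient.ContainerRequest(
        capability,
        null,                     // nodes: no locality constraint
        null,                     // racks: no locality constraint
        Priority.newInstance(1),
        true,                     // relaxLocality
        null,                     // node label expression
        execType);
    client.addContainerRequest(request);
  }
}

You would also need to enable it on the RM side, via something like
yarn.resourcemanager.opportunistic-container-allocation.enabled=true
(again, verify the property name against your version).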

Cheers
-Arun

On Fri, Oct 13, 2017 at 11:30 AM, Craig Ingram <cinple.sln@gmail.com> wrote:

> I was recently doing some research into Spark on YARN's startup time and
> observed slow, synchronous allocation of containers/executors. I am testing
> on a 4-node bare-metal cluster with 48 cores and 128 GB of memory per node.
> YARN was only allocating about 3 containers per second. Moreover, when
> starting 3 Spark applications at the same time, each requesting 44
> containers, the first application would receive all 44 of its requested
> containers before the next application started getting any, and so on.
>
> From looking at the code, it appears this is by design. There is an
> undocumented configuration variable that will enable asynchronous
> allocation of containers. I'm sure I'm missing something, but why is this
> not the default? Is there a bug or race condition in this code path? I've
> done some testing with it and it's been working and is significantly
> faster.
>
> Here's the config:
> `yarn.scheduler.capacity.schedule-asynchronously.enable`
>
> Any help understanding this would be appreciated.
>
> Thanks,
> Craig
>
>
> If you're curious about the performance difference with this setting, here
> are the results:
>
> The following tool was used for the benchmarks:
> https://github.com/SparkTC/spark-bench
>
> # async scheduler research
> The goal of this test is to determine whether running Spark on YARN with
> async scheduling of containers reduces the amount of time required for an
> application to receive all of its requested resources. This setting should
> also reduce the overall runtime of short-lived applications/stages or
> notebook paragraphs. It could prove crucial to achieving optimal
> performance when sharing resources on a cluster with dynamic allocation
> (dynalloc) enabled.
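>
> For context, a minimal sketch of the spark-defaults.conf settings assumed
> on the dynalloc cluster (values illustrative, not the exact test config):
>
> ```
> spark.dynamicAllocation.enabled        true
> # dynamic allocation needs the external shuffle service on the NodeManagers
> spark.shuffle.service.enabled          true
> spark.dynamicAllocation.minExecutors   0
> spark.dynamicAllocation.maxExecutors   100
> ```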
> ## Test Setup
> Must update /etc/hadoop/conf/capacity-scheduler.xml (or through Ambari)
> between runs.
> `yarn.scheduler.capacity.schedule-asynchronously.enable=true|false`
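>
> For reference, a sketch of how the toggle looks in capacity-scheduler.xml
> (this is the only property changed between runs):
>
> ```xml
> <property>
>   <name>yarn.scheduler.capacity.schedule-asynchronously.enable</name>
>   <!-- set to false for the sync runs -->
>   <value>true</value>
> </property>
> ```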
>
> The conf files request executor counts of (see the spark-submit sketch
> after this list):
> * 2
> * 20
> * 50
> * 100
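>
> A sketch of how each run requests its executors (the actual conf files are
> not reproduced here; the flags are standard spark-submit options and
> my-app.jar is a placeholder - shown for the 50-executor case):
>
> ```
> spark-submit \
>   --master yarn \
>   --deploy-mode cluster \
>   --num-executors 50 \
>   --executor-cores 1 \
>   --executor-memory 2g \
>   my-app.jar
> ```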
> The apps are submitted to the default queue on each cluster, which caps
> at 48 cores on dynalloc and 72 cores on baremetal. The default queue was
> expanded for the last two tests on baremetal so that it could potentially
> take advantage of all 144 cores.
> ## Test Environments
> ### dynalloc
> 4 VMs in Fyre (1 master, 3 workers)
> 8 CPUs/16 GB per node
> model name    : QEMU Virtual CPU version 2.5+
> ### baremetal
> 4 baremetal instances in Fyre (1 master, 3 workers)
> 48 CPUs/128 GB per node
> model name    : Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
>
> ## Using spark-bench with timedsleep workload sync
> ### dynalloc
> conf | avg (s) | stddev (s)
> --- | --- | ---
> spark-on-yarn-schedule-async0.time | 23.814900 | 1.110725
> spark-on-yarn-schedule-async1.time | 29.770250 | 0.830528
> spark-on-yarn-schedule-async2.time | 44.486600 | 0.593516
> spark-on-yarn-schedule-async3.time | 44.337700 | 0.490139
> ### baremetal - 2 queues splitting the cluster (72 cores each)
> conf | avg (s) | stddev (s)
> --- | --- | ---
> spark-on-yarn-schedule-async0.time | 14.827000 | 0.292290
> spark-on-yarn-schedule-async1.time | 19.613150 | 0.155421
> spark-on-yarn-schedule-async2.time | 30.768400 | 0.083400
> spark-on-yarn-schedule-async3.time | 40.931850 | 0.092160
> ### baremetal - 1 queue to rule them all - 144 cores
> conf | avg (s) | stddev (s)
> --- | --- | ---
> spark-on-yarn-schedule-async0.time | 14.833050 | 0.334061
> spark-on-yarn-schedule-async1.time | 19.575000 | 0.212836
> spark-on-yarn-schedule-async2.time | 30.765350 | 0.111035
> spark-on-yarn-schedule-async3.time | 41.763300 | 0.182700
>
> ## Using spark-bench with timedsleep workload async
> ### dynalloc
> conf | avg (s) | stddev (s)
> --- | --- | ---
> spark-on-yarn-schedule-async0.time | 22.575150 | 0.574296
> spark-on-yarn-schedule-async1.time | 26.904150 | 1.244602
> spark-on-yarn-schedule-async2.time | 44.721800 | 0.655388
> spark-on-yarn-schedule-async3.time | 44.570000 | 0.514540
> #### 2nd run
> conf | avg (s) | stddev (s)
> --- | --- | ---
> spark-on-yarn-schedule-async0.time | 22.441200 | 0.715875
> spark-on-yarn-schedule-async1.time | 26.683400 | 0.583762
> spark-on-yarn-schedule-async2.time | 44.227250 | 0.512568
> spark-on-yarn-schedule-async3.time | 44.238750 | 0.329712
> ### baremetal - 2 queues splitting the cluster (72 cores each)
> conf | avg (s) | stddev (s)
> --- | --- | ---
> spark-on-yarn-schedule-async0.time | 12.902350 | 0.125505
> spark-on-yarn-schedule-async1.time | 13.830600 | 0.169598
> spark-on-yarn-schedule-async2.time | 16.738050 | 0.265091
> spark-on-yarn-schedule-async3.time | 40.654500 | 0.111417
> ### baremetal - 1 queue to rule them all - 144 cores
> conf | avg (s) | stddev (s)
> --- | --- | ---
> spark-on-yarn-schedule-async0.time | 12.987150 | 0.118169
> spark-on-yarn-schedule-async1.time | 13.837150 | 0.145871
> spark-on-yarn-schedule-async2.time | 16.816300 | 0.253437
> spark-on-yarn-schedule-async3.time | 23.113450 | 0.320744
>
