spark-user mailing list archives

From Denny Lee <>
Subject Re: spark-submit on YARN is slow
Date Sat, 06 Dec 2014 06:34:51 GMT
Sorry for the delay in my response. For my Spark calls on both standalone
and YARN, I am using --executor-memory and --total-executor-cores for the
submission.  In standalone, my baseline query completes in ~40s, while in
YARN it completes in ~1800s.  It does not appear from the RM web UI that
it's asking for more resources than are available, but by the same token,
it appears that it's only using a small number of the available cores and
a small amount of the available memory.

Saying this, let me re-try using the --executor-cores, --executor-memory,
and --num-executors arguments as suggested (and documented) vs. the

On Fri Dec 05 2014 at 1:14:53 PM Andrew Or <> wrote:

> Hey Arun I've seen that behavior before. It happens when the cluster
> doesn't have enough resources to offer and the RM hasn't given us our
> containers yet. Can you check the RM Web UI at port 8088 to see whether
> your application is requesting more resources than the cluster has to offer?
> 2014-12-05 12:51 GMT-08:00 Sandy Ryza <>:
>> Hey Arun,
>> The sleeps would cause at most about 5 seconds of overhead.  The idea was
>> to give executors some time to register.  On more recent versions, they
>> were replaced with the spark.scheduler.minRegisteredResourcesRatio and
>> spark.scheduler.maxRegisteredResourcesWaitingTime settings.  As of 1.1, by default
>> YARN will wait until either 30 seconds have passed or 80% of the requested
>> executors have registered.
>> -Sandy
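
The wait rule Sandy describes above can be sketched as a small predicate. This is only a model of the stated rule (80% of requested executors registered, or 30 seconds elapsed), not Spark's actual scheduler code; the function name and parameter names are hypothetical, and the defaults are taken from the message above.

```python
def ready_to_schedule(registered: int,
                      requested: int,
                      waited_seconds: float,
                      min_ratio: float = 0.8,
                      max_wait_seconds: float = 30.0) -> bool:
    """Return True once scheduling may begin under the described rule.

    Hypothetical helper: min_ratio mirrors the idea behind
    spark.scheduler.minRegisteredResourcesRatio and max_wait_seconds the
    idea behind spark.scheduler.maxRegisteredResourcesWaitingTime.
    """
    if requested <= 0:
        # Nothing was requested, so there is nothing to wait for.
        return True
    enough_registered = registered / requested >= min_ratio
    waited_long_enough = waited_seconds >= max_wait_seconds
    return enough_registered or waited_long_enough


# 5 of 10 executors after 10 seconds: still waiting.
print(ready_to_schedule(5, 10, 10.0))
# 8 of 10 executors after 1 second: the ratio is met.
print(ready_to_schedule(8, 10, 1.0))
```

Under this rule the startup wait is bounded by the timeout, which is consistent with the point that it cannot explain a job running for 30 minutes.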
>> On Fri, Dec 5, 2014 at 12:46 PM, Ashish Rangole <>
>> wrote:
>>> This is likely not the case here, but one thing to point out with YARN
>>> parameters like --num-executors is that they should be specified *before*
>>> the app jar and app args on the spark-submit command line; otherwise the
>>> app only gets the default number of containers, which is 2.
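
To illustrate the ordering point above, here is a minimal sketch that assembles a spark-submit argument list with every option placed ahead of the application jar. The helper name, the jar name, the application argument, and the resource values are all hypothetical; only the flag names and the "yarn-cluster" master come from this thread.

```python
from typing import Dict, List, Sequence


def build_spark_submit(app_jar: str,
                       app_args: Sequence[str],
                       options: Dict[str, str],
                       master: str = "yarn-cluster") -> List[str]:
    """Build a spark-submit argv with all options before the app jar.

    Hypothetical helper: anything placed after the jar is passed to the
    application itself rather than interpreted by spark-submit.
    """
    cmd = ["spark-submit", "--master", master]
    for flag, value in options.items():
        cmd += [flag, str(value)]
    cmd.append(app_jar)      # the jar marks the end of spark-submit options
    cmd += list(app_args)    # everything after the jar belongs to the app
    return cmd


# Placeholder values, chosen only for illustration.
cmd = build_spark_submit(
    "my-app.jar",
    ["input-path"],
    {
        "--num-executors": "10",
        "--executor-cores": "4",
        "--executor-memory": "8g",
    },
)
print(" ".join(cmd))
```

The resulting list can be handed to subprocess.run(cmd) on a machine where spark-submit is on the PATH.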
>>> On Dec 5, 2014 12:22 PM, "Sandy Ryza" <> wrote:
>>>> Hi Denny,
>>>> Those sleeps were only at startup, so if jobs are taking significantly
>>>> longer on YARN, that should be a different problem.  When you ran on YARN,
>>>> did you use the --executor-cores, --executor-memory, and --num-executors
>>>> arguments?  When running against a standalone cluster, by default Spark
>>>> will make use of all the cluster resources, but when running against YARN,
>>>> Spark defaults to a couple tiny executors.
>>>> -Sandy
>>>> On Fri, Dec 5, 2014 at 11:32 AM, Denny Lee <>
>>>> wrote:
>>>>> My submissions of Spark on YARN (CDH 5.2) resulted in a few thousand
>>>>> steps. When I ran this in standalone cluster mode, the query finished
>>>>> in 55s, but on YARN the query was still running 30 min later. Would the
>>>>> coded sleeps potentially be in play here?
>>>>> On Fri, Dec 5, 2014 at 11:23 Sandy Ryza <>
>>>>> wrote:
>>>>>> Hi Tobias,
>>>>>> What version are you using?  In some recent versions, we had a couple
>>>>>> of large hardcoded sleeps on the Spark side.
>>>>>> -Sandy
>>>>>> On Fri, Dec 5, 2014 at 11:15 AM, Andrew Or <>
>>>>>> wrote:
>>>>>>> Hey Tobias,
>>>>>>> As you suspect, it's slow because the resource manager in YARN
>>>>>>> takes a while to grant resources. YARN first needs to set up the
>>>>>>> application master container, and then this AM needs to request
>>>>>>> more containers for Spark executors. I think this accounts for
>>>>>>> most of the overhead. The remainder probably comes from how our
>>>>>>> own YARN integration code polls application (every second) and
>>>>>>> resource states (every 5 seconds IIRC). I haven't explored in
>>>>>>> detail whether there are optimizations there that can speed this
>>>>>>> up, but I believe most of the overhead comes from YARN itself.
>>>>>>> In other words, no, I don't know of any quick fix that you can
>>>>>>> apply on your end to speed this up.
>>>>>>> -Andrew
>>>>>>> 2014-12-03 20:10 GMT-08:00 Tobias Pfeiffer <>:
>>>>>>>> Hi,
>>>>>>>> I am using spark-submit to submit my application to YARN in
>>>>>>>> "yarn-cluster" mode. I have both the Spark assembly jar file as
>>>>>>>> well as my application jar file put in HDFS and can see from the
>>>>>>>> logging output that both files are used from there. However, it
>>>>>>>> still takes about 10 seconds for my application's yarnAppState to
>>>>>>>> switch from ACCEPTED to RUNNING.
>>>>>>>> I am aware that this is probably not a Spark issue but some
>>>>>>>> configuration setting (or YARN-inherent slowness); I was just
>>>>>>>> wondering if anyone has advice on how to speed this up.
>>>>>>>> Thanks
>>>>>>>> Tobias
