spark-user mailing list archives

From Sandy Ryza <sandy.r...@cloudera.com>
Subject Re: spark-submit on YARN is slow
Date Sat, 06 Dec 2014 09:39:22 GMT
Great to hear!

-Sandy

On Fri, Dec 5, 2014 at 11:17 PM, Denny Lee <denny.g.lee@gmail.com> wrote:

> Okay, my bad for not testing out the documented arguments - once I use the
> correct ones, the query completes in ~55s (I can probably make it
> faster).   Thanks for the help, eh?!
>
>
>
> On Fri Dec 05 2014 at 10:34:50 PM Denny Lee <denny.g.lee@gmail.com> wrote:
>
>> Sorry for the delay in my response - for my Spark calls for standalone
>> and YARN, I am using the --executor-memory and --total-executor-cores for
>> the submission.  In standalone, my baseline query completes in ~40s while
>> in YARN, it completes in ~1800s.  It does not appear from the RM web UI
>> that it's asking for more resources than available but by the same token,
>> it appears that it's only using a small fraction of the cores and
>> available memory.
>>
>> That said, let me retry using the --executor-cores,
>> --executor-memory, and --num-executors arguments as suggested (and
>> documented) instead of --total-executor-cores.
>>
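
A minimal sketch of the translation described above, from a standalone-style
total core budget to the YARN-style flags. The helper name and the example
numbers (48 cores, 4 cores per executor, 8g) are illustrative assumptions,
not values from this thread.

```python
def yarn_executor_flags(total_cores, executor_cores, executor_memory):
    """Turn a total core budget into --num-executors / --executor-cores flags.

    Hypothetical helper: rounds down so the request never exceeds the budget,
    but always asks for at least one executor.
    """
    if total_cores <= 0 or executor_cores <= 0:
        raise ValueError("core counts must be positive")
    num_executors = max(1, total_cores // executor_cores)
    return [
        "--num-executors", str(num_executors),
        "--executor-cores", str(executor_cores),
        "--executor-memory", executor_memory,
    ]


# Illustrative values only.
flags = yarn_executor_flags(total_cores=48, executor_cores=4, executor_memory="8g")
print(" ".join(flags))
```
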
>>
>> On Fri Dec 05 2014 at 1:14:53 PM Andrew Or <andrew@databricks.com> wrote:
>>
>>> Hey Arun, I've seen that behavior before. It happens when the cluster
>>> doesn't have enough resources to offer and the RM hasn't given us our
>>> containers yet. Can you check the RM Web UI at port 8088 to see whether
>>> your application is requesting more resources than the cluster has to offer?
>>>
>>> 2014-12-05 12:51 GMT-08:00 Sandy Ryza <sandy.ryza@cloudera.com>:
>>>
>>>> Hey Arun,
>>>>
>>>> The sleeps would only cause at most about 5 seconds of overhead.  The idea
>>>> was to give executors some time to register.  In more recent versions, they
>>>> were replaced with the spark.scheduler.minRegisteredResourcesRatio and
>>>> spark.scheduler.maxRegisteredResourcesWaitingTime settings.  As of 1.1, by
>>>> default YARN will wait until either 30 seconds have passed or 80% of the
>>>> requested executors have registered.
>>>>
>>>> -Sandy
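
The wait rule described above (proceed once 80% of requested executors have
registered, or once 30 seconds have passed, whichever comes first) can be
sketched roughly as follows. This is an illustration of the rule only, not
Spark's actual scheduler code; the function and parameter names are made up
for the example.

```python
def ready_to_schedule(registered, requested, waited_seconds,
                      min_ratio=0.8, max_wait_seconds=30.0):
    """Return True once scheduling may begin under the described rule.

    min_ratio stands in for spark.scheduler.minRegisteredResourcesRatio and
    max_wait_seconds for spark.scheduler.maxRegisteredResourcesWaitingTime;
    the defaults are the 80% / 30 s figures quoted in the message above.
    """
    if waited_seconds >= max_wait_seconds:
        return True  # waited long enough, start regardless of registrations
    if requested <= 0:
        return True  # nothing was requested, so there is nothing to wait for
    return registered / requested >= min_ratio


print(ready_to_schedule(registered=7, requested=10, waited_seconds=5.0))
print(ready_to_schedule(registered=8, requested=10, waited_seconds=5.0))
print(ready_to_schedule(registered=0, requested=10, waited_seconds=30.0))
```
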
>>>>
>>>> On Fri, Dec 5, 2014 at 12:46 PM, Ashish Rangole <arangole@gmail.com>
>>>> wrote:
>>>>
>>>>> This is likely not the case here, but one thing to point out about
>>>>> YARN parameters like --num-executors is that they should be specified
>>>>> *before* the app jar and app args on the spark-submit command line;
>>>>> otherwise the app only gets the default number of containers, which is 2.
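
A small sketch of the ordering point above: every spark-submit option goes
before the application jar, and anything after the jar is passed to the
application itself. The jar name, application arguments, and resource values
are hypothetical; only the flag names and "yarn-cluster" come from this
thread.

```python
def build_spark_submit(app_jar, app_args, num_executors, executor_cores,
                       executor_memory):
    """Build a spark-submit argument list with options ahead of the jar."""
    return [
        "spark-submit",
        "--master", "yarn-cluster",
        "--num-executors", str(num_executors),
        "--executor-cores", str(executor_cores),
        "--executor-memory", executor_memory,
        app_jar,      # everything after this point belongs to the application
        *app_args,
    ]


# Hypothetical jar, arguments, and sizes, for illustration only.
cmd = build_spark_submit("my-app.jar", ["input", "output"],
                         num_executors=12, executor_cores=4,
                         executor_memory="8g")
print(" ".join(cmd))
```
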
>>>>> On Dec 5, 2014 12:22 PM, "Sandy Ryza" <sandy.ryza@cloudera.com> wrote:
>>>>>
>>>>>> Hi Denny,
>>>>>>
>>>>>> Those sleeps were only at startup, so if jobs are taking
>>>>>> significantly longer on YARN, that should be a different problem.
>>>>>> When you ran on YARN, did you use the --executor-cores,
>>>>>> --executor-memory, and --num-executors arguments?  When running
>>>>>> against a standalone cluster, by default Spark will make use of all
>>>>>> the cluster resources, but when running against YARN, Spark defaults
>>>>>> to a couple tiny executors.
>>>>>>
>>>>>> -Sandy
>>>>>>
>>>>>> On Fri, Dec 5, 2014 at 11:32 AM, Denny Lee <denny.g.lee@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> My submissions of Spark on YARN (CDH 5.2) resulted in a few thousand
>>>>>>> steps. When I ran this on a standalone cluster, the query finished
>>>>>>> in 55s, but on YARN the query was still running 30min later.
>>>>>>> Would the hard-coded sleeps potentially be in play here?
>>>>>>> On Fri, Dec 5, 2014 at 11:23 Sandy Ryza <sandy.ryza@cloudera.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Tobias,
>>>>>>>>
>>>>>>>> What version are you using?  In some recent versions, we had a
>>>>>>>> couple of large hardcoded sleeps on the Spark side.
>>>>>>>>
>>>>>>>> -Sandy
>>>>>>>>
>>>>>>>> On Fri, Dec 5, 2014 at 11:15 AM, Andrew Or <andrew@databricks.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hey Tobias,
>>>>>>>>>
>>>>>>>>> As you suspect, the reason it's slow is that the resource
>>>>>>>>> manager in YARN takes a while to grant resources. This is because
>>>>>>>>> YARN needs to first set up the application master container, and
>>>>>>>>> then this AM needs to request more containers for Spark executors.
>>>>>>>>> I think this accounts for most of the overhead. The remainder
>>>>>>>>> probably comes from how our own YARN integration code polls
>>>>>>>>> application (every second) and cluster resource states (every 5
>>>>>>>>> seconds IIRC). I haven't explored in detail whether there are
>>>>>>>>> optimizations there that can speed this up, but I believe most of
>>>>>>>>> the overhead comes from YARN itself.
>>>>>>>>>
>>>>>>>>> In other words, no, I don't know of any quick fix on your end that
>>>>>>>>> you can do to speed this up.
>>>>>>>>>
>>>>>>>>> -Andrew
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2014-12-03 20:10 GMT-08:00 Tobias Pfeiffer <tgp@preferred.jp>:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I am using spark-submit to submit my application to YARN in
>>>>>>>>>> "yarn-cluster" mode. I have both the Spark assembly jar file as
>>>>>>>>>> well as my application jar file put in HDFS and can see from the
>>>>>>>>>> logging output that both files are used from there. However, it
>>>>>>>>>> still takes about 10 seconds for my application's yarnAppState to
>>>>>>>>>> switch from ACCEPTED to RUNNING.
>>>>>>>>>>
>>>>>>>>>> I am aware that this is probably not a Spark issue, but some YARN
>>>>>>>>>> configuration setting (or YARN-inherent slowness); I was just
>>>>>>>>>> wondering if anyone has advice on how to speed this up.
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>> Tobias
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>
