spark-user mailing list archives

From Sandy Ryza <sandy.r...@cloudera.com>
Subject Re: Spark Performance on Yarn
Date Fri, 20 Feb 2015 19:50:40 GMT
Hi Kelvin,

spark.executor.memory controls the size of the executor heaps.

spark.yarn.executor.memoryOverhead is the amount of memory to request from
YARN beyond the heap size.  This accounts for the fact that JVMs use some
non-heap memory.
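
As a rough illustration of that relationship (a sketch, not Spark's own code): the memory requested from YARN per executor is the heap plus the overhead. The function name and the 384 MB overhead value are assumptions chosen for illustration; check your Spark version for the actual default overhead.

```python
def yarn_container_request_mb(executor_memory_mb: int, memory_overhead_mb: int) -> int:
    """Memory requested from YARN per executor: JVM heap plus non-heap overhead."""
    return executor_memory_mb + memory_overhead_mb


heap_mb = 2400       # matches the -Xmx2400m shown in the quoted container log
overhead_mb = 384    # assumed value for spark.yarn.executor.memoryOverhead

request_mb = yarn_container_request_mb(heap_mb, overhead_mb)
print(f"{request_mb} MB requested ({request_mb / 1024:.2f} GB)")
```

With these assumed inputs the request is 2784 MB, roughly 2.7 GB, which is consistent with the physical memory limit reported in the quoted YARN warning. YARN may additionally round a request up to its own allocation increment, which this sketch ignores.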

The Spark heap is divided into spark.storage.memoryFraction (default 0.6)
and spark.shuffle.memoryFraction (default 0.2), and the rest is for basic
Spark bookkeeping and anything the user does inside UDFs.
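
To make the split concrete, here is a minimal sketch of the arithmetic, assuming the default fractions above and a 2400 MB heap (the -Xmx value in the quoted log). The function name is hypothetical, and the sketch ignores any safety margins Spark applies internally on top of these fractions.

```python
def heap_regions_mb(heap_mb: float,
                    storage_fraction: float = 0.6,
                    shuffle_fraction: float = 0.2) -> dict:
    """Split an executor heap by the two fractions; the remainder is for everything else."""
    storage = heap_mb * storage_fraction   # spark.storage.memoryFraction
    shuffle = heap_mb * shuffle_fraction   # spark.shuffle.memoryFraction
    other = heap_mb - storage - shuffle    # bookkeeping and user code (e.g. UDFs)
    return {"storage": storage, "shuffle": shuffle, "other": other}


regions = heap_regions_mb(2400)
for name, mb in regions.items():
    print(f"{name}: {mb:.0f} MB")
```

For a 2400 MB heap this works out to about 1440 MB for storage, 480 MB for shuffle, and 480 MB for everything else.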

-Sandy



On Fri, Feb 20, 2015 at 11:44 AM, Kelvin Chu <2dot7kelvin@gmail.com> wrote:

> Hi Sandy,
>
> I am also doing memory tuning on YARN. Just want to confirm, is it correct
> to say:
>
> spark.executor.memory - spark.yarn.executor.memoryOverhead = the memory I
> can actually use in my jvm application
>
> If it is not, what is the correct relationship? Any other variables or
> config parameters in play? Thanks.
>
> Kelvin
>
> On Fri, Feb 20, 2015 at 9:45 AM, Sandy Ryza <sandy.ryza@cloudera.com>
> wrote:
>
>> If that's the error you're hitting, the fix is to boost
>> spark.yarn.executor.memoryOverhead, which will put some extra room in
>> between the executor heap sizes and the amount of memory requested for them
>> from YARN.
>>
>> -Sandy
>>
>> On Fri, Feb 20, 2015 at 9:40 AM, lbierman <leebierman@gmail.com> wrote:
>>
>>> A bit more context on this issue, from the container logs on the executor
>>> (shown below).
>>>
>>> Given my cluster specs above what would be appropriate parameters to pass
>>> into :
>>> --num-executors --num-cores --executor-memory
>>>
>>> I had tried it with --executor-memory 2500MB
>>>
>>> 2015-02-20 06:50:09,056 WARN
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl:
>>> Container [pid=23320,containerID=container_1423083596644_0238_01_004160] is
>>> running beyond physical memory limits. Current usage: 2.8 GB of 2.7 GB
>>> physical memory used; 4.4 GB of 5.8 GB virtual memory used. Killing
>>> container.
>>> Dump of the process-tree for container_1423083596644_0238_01_004160 :
>>>         |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS)
>>> SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
>>>         |- 23320 23318 23320 23320 (bash) 0 0 108650496 305 /bin/bash -c
>>> /usr/java/latest/bin/java -server -XX:OnOutOfMemoryError='kill %p'
>>> -Xms2400m
>>> -Xmx2400m
>>>
>>> -Djava.io.tmpdir=/dfs/yarn/nm/usercache/root/appcache/application_1423083596644_0238/container_1423083596644_0238_01_004160/tmp
>>>
>>> -Dspark.yarn.app.container.log.dir=/var/log/hadoop-yarn/container/application_1423083596644_0238/container_1423083596644_0238_01_004160
>>> org.apache.spark.executor.CoarseGrainedExecutorBackend
>>> akka.tcp://sparkDriver@ip-10-168-86-13.ec2.internal:42535/user/CoarseGrainedScheduler
>>> 8 ip-10-99-162-56.ec2.internal 1 application_1423083596644_0238 1>
>>>
>>> /var/log/hadoop-yarn/container/application_1423083596644_0238/container_1423083596644_0238_01_004160/stdout
>>> 2>
>>>
>>> /var/log/hadoop-yarn/container/application_1423083596644_0238/container_1423083596644_0238_01_004160/stderr
>>>         |- 23323 23320 23320 23320 (java) 922271 12263 4612222976 724218
>>> /usr/java/latest/bin/java -server -XX:OnOutOfMemoryError=kill %p
>>> -Xms2400m
>>> -Xmx2400m
>>>
>>> -Djava.io.tmpdir=/dfs/yarn/nm/usercache/root/appcache/application_1423083596644_0238/container_1423083596644_0238_01_004160/tmp
>>>
>>> -Dspark.yarn.app.container.log.dir=/var/log/hadoop-yarn/container/application_1423083596644_0238/container_1423083596644_0238_01_004160
>>> org.apache.spark.executor.CoarseGrainedExecutorBackend
>>> akka.tcp://sparkDriver@ip-10-168-86-13.ec2.internal:42535/user/Coarse
