spark-user mailing list archives

From Ankur Srivastava <ankur.srivast...@gmail.com>
Subject Re: Understanding Spark Memory distribution
Date Sat, 28 Mar 2015 07:39:31 GMT
Hi Wisely,
I have 26 GB for the driver, and the master is running on an m3.2xlarge machine.

I see OOM errors on the workers even though they are running with 26 GB of memory.

Thanks

On Fri, Mar 27, 2015, 11:43 PM Wisely Chen <wiselychen@appier.com> wrote:

> Hi
>
> In a broadcast, Spark collects the whole 3 GB object on the master node and
> broadcasts it to each slave. It is very common for the master node
> not to have enough memory.
>
> What are your master node settings?
>
> Wisely Chen
>
> Ankur Srivastava <ankur.srivastava@gmail.com> wrote on Saturday, March 28, 2015:
>
>> I have increased the "spark.storage.memoryFraction" to 0.4 but I still
>> get OOM errors on Spark Executor nodes
>>
>>
>> 15/03/27 23:19:51 INFO BlockManagerMaster: Updated info of block
>> broadcast_5_piece10
>>
>> 15/03/27 23:19:51 INFO TorrentBroadcast: Reading broadcast variable 5
>> took 2704 ms
>>
>> 15/03/27 23:19:52 INFO MemoryStore: ensureFreeSpace(672530208) called
>> with curMem=2484698683, maxMem=9631778734
>>
>> 15/03/27 23:19:52 INFO MemoryStore: Block broadcast_5 stored as values in
>> memory (estimated size 641.4 MB, free 6.0 GB)
>>
>> 15/03/27 23:34:02 WARN AkkaUtils: Error sending message in 1 attempts
>>
>> java.util.concurrent.TimeoutException: Futures timed out after [30
>> seconds]
>>
>>         at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
>>         at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
>>         at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
>>         at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
>>         at scala.concurrent.Await$.result(package.scala:107)
>>         at org.apache.spark.util.AkkaUtils$.askWithReply(AkkaUtils.scala:187)
>>         at org.apache.spark.executor.Executor$$anon$1.run(Executor.scala:407)
>>
>> 15/03/27 23:34:02 ERROR Executor: Exception in task 7.0 in stage 2.0 (TID
>> 4007)
>>
>> java.lang.OutOfMemoryError: GC overhead limit exceeded
>>
>>         at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1986)
>>         at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>>         at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>>
>> Thanks
>>
>> Ankur
>>
>> On Fri, Mar 27, 2015 at 2:52 PM, Ankur Srivastava <
>> ankur.srivastava@gmail.com> wrote:
>>
>>> Hi All,
>>>
>>> I am running a spark cluster on EC2 instances of type: m3.2xlarge. I
>>> have given 26gb of memory with all 8 cores to my executors. I can see that
>>> in the logs too:
>>>
>>> *15/03/27 21:31:06 INFO AppClient$ClientActor: Executor added:
>>> app-20150327213106-0000/0 on worker-20150327212934-10.x.y.z-40128
>>> (10.x.y.z:40128) with 8 cores*
>>>
>>> I am not caching any RDD so I have set "spark.storage.memoryFraction" to
>>> 0.2. I can see on SparkUI under executors tab Memory used is 0.0/4.5 GB.
>>>
>>> I am now confused by these logs:
>>>
>>> *15/03/27 21:31:08 INFO BlockManagerMasterActor: Registering block
>>> manager 10.77.100.196:58407 with 4.5 GB RAM,
>>> BlockManagerId(4, 10.x.y.z, 58407)*
>>>
>>> I am broadcasting a large 3 GB object, and when I create an RDD
>>> afterwards, the logs show this 4.5 GB of memory filling up, and
>>> then I get an OOM error.
>>>
>>> How can I make block manager use more memory?
>>>
>>> Is there any other fine tuning I need to do for broadcasting large
>>> objects?
>>>
>>> And does broadcast variable use cache memory or rest of the heap?
>>>
>>>
>>> Thanks
>>>
>>> Ankur
>>>
>>
>>
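
The memory figures quoted in this thread can be cross-checked with a little arithmetic. The sketch below assumes the Spark 1.x static memory model, in which the block manager's capacity is the JVM's reported maximum heap multiplied by "spark.storage.memoryFraction" and by a safety fraction that defaults to 0.9. The safety fraction and the helper names are assumptions that do not appear in the thread; only the byte counts and the two memoryFraction values (0.2 and 0.4) come from the quoted logs.

```python
# Rough cross-check of the numbers in the quoted logs.
# Assumption (not stated in the thread): Spark 1.x sizes the block manager as
#   JVM max heap * spark.storage.memoryFraction * safety fraction (default 0.9).
SAFETY_FRACTION = 0.9
GIB = 1024 ** 3
MIB = 1024 ** 2


def storage_capacity(jvm_max_bytes, memory_fraction, safety_fraction=SAFETY_FRACTION):
    """Block manager capacity in bytes under the assumed formula."""
    return int(jvm_max_bytes * memory_fraction * safety_fraction)


def implied_jvm_max(max_mem_bytes, memory_fraction, safety_fraction=SAFETY_FRACTION):
    """Invert the formula: the JVM max heap implied by a logged maxMem."""
    return max_mem_bytes / (memory_fraction * safety_fraction)


# Values copied from the MemoryStore log line (memoryFraction was 0.4 then).
max_mem = 9631778734   # maxMem
cur_mem = 2484698683   # curMem
block = 672530208      # argument of ensureFreeSpace(...)

# Heap the JVM reports, which is somewhat below the configured 26 GB.
jvm_max = implied_jvm_max(max_mem, 0.4)
print("implied JVM max heap (GiB):", round(jvm_max / GIB, 1))

# With memoryFraction = 0.2 the same heap gives the "4.5 GB RAM" line.
cap_02 = storage_capacity(jvm_max, 0.2)
print("capacity at 0.2 (GiB):", round(cap_02 / GIB, 1))

# Size of broadcast_5 and the free space left after storing it.
free_after = max_mem - cur_mem - block
print("broadcast_5 size (MiB):", round(block / MIB, 1))
print("free after store (GiB):", round(free_after / GIB, 1))
```

Under these assumptions the results line up with the logs: about 4.5 GB of storage memory at 0.2, a 641.4 MB block, and 6.0 GB free afterwards. The "Block broadcast_5 stored as values in memory" line also indicates that the broadcast is held by the block manager's memory store.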
