I have 26gb for driver and the master is running on m3.2xlarge machines.
I see OOM errors on workers and even they are running with 26th of memory.
HiIn broadcast, spark will collect the whole 3gb object into master node and broadcast to each slaves. It is very common situation that the master node don't have enough memory .What is your master node settings?Wisely Chen
Ankur Srivastava <email@example.com> 於 2015年3月28日 星期六寫道：
I have increased the "spark.storage.memoryFraction" to 0.4 but I still get OOM errors on Spark Executor nodes
15/03/27 23:19:51 INFO BlockManagerMaster: Updated info of block broadcast_5_piece10
15/03/27 23:19:51 INFO TorrentBroadcast: Reading broadcast variable 5 took 2704 ms
15/03/27 23:19:52 INFO MemoryStore: ensureFreeSpace(672530208) called with curMem=2484698683, maxMem=9631778734
15/03/27 23:19:52 INFO MemoryStore: Block broadcast_5 stored as values in memory (estimated size 641.4 MB, free 6.0 GB)
15/03/27 23:34:02 WARN AkkaUtils: Error sending message in 1 attempts
java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
15/03/27 23:34:02 ERROR Executor: Exception in task 7.0 in stage 2.0 (TID 4007)
java.lang.OutOfMemoryError: GC overhead limit exceeded
AnkurOn Fri, Mar 27, 2015 at 2:52 PM, Ankur Srivastava <firstname.lastname@example.org> wrote:
I am running a spark cluster on EC2 instances of type: m3.2xlarge. I have given 26gb of memory with all 8 cores to my executors. I can see that in the logs too:
15/03/27 21:31:06 INFO AppClient$ClientActor: Executor added: app-20150327213106-0000/0 on worker-20150327212934-10.x.y.z-40128 (10.x.y.z:40128) with 8 cores
I am not caching any RDD so I have set "spark.storage.memoryFraction" to 0.2. I can see on SparkUI under executors tab Memory used is 0.0/4.5 GB.
I am now confused with these logs?
15/03/27 21:31:08 INFO BlockManagerMasterActor: Registering block manager 10.77.100.196:58407 with 4.5 GB RAM, BlockManagerId(4, 10.x.y.z, 58407)
I am broadcasting a large object of 3 gb and after that when I am creating an RDD, I see logs which show this 4.5 GB memory getting full and then I get OOM.
How can I make block manager use more memory?
Is there any other fine tuning I need to do for broadcasting large objects?
And does broadcast variable use cache memory or rest of the heap?