spark-user mailing list archives

From akhandeshi <ami.khande...@gmail.com>
Subject OOM - Requested array size exceeds VM limit
Date Mon, 03 Nov 2014 21:44:52 GMT
I am running in local (client) mode. My VM has 16 CPUs and 108 GB of RAM. My
configuration is as follows:

spark.executor.extraJavaOptions  -XX:+PrintGCDetails -XX:+UseCompressedOops
-XX:+UseParallelGC -XX:+UseParallelOldGC -XX:+DisableExplicitGC
-XX:MaxPermSize=1024m 

spark.daemon.memory=20g
spark.driver.memory=20g
spark.executor.memory=20g

export SPARK_DAEMON_JAVA_OPTS="-XX:+UseConcMarkSweepGC -XX:+PrintGCDetails
-XX:+PrintGCTimeStamps -XX:+UseCompressedOops -XX:+UseParallelGC
-XX:+UseParallelOldGC -XX:+DisableExplicitGC -XX:MaxPermSize=1024m"

/usr/local/spark-1.1.0/bin/spark-submit --class main.java.MyAppMainProcess
--master local[32]  MyApp.jar >> myapp.out

/11/03 20:45:43 INFO BlockManager: Removing block broadcast_4
14/11/03 20:45:43 INFO MemoryStore: Block broadcast_4 of size 3872 dropped from memory (free 16669590422)
14/11/03 20:45:43 INFO ContextCleaner: Cleaned broadcast 4
14/11/03 20:46:00 WARN BlockManager: Putting block rdd_19_5 failed
14/11/03 20:46:00 ERROR Executor: Exception in task 5.0 in stage 3.0 (TID 70)
java.lang.OutOfMemoryError: Requested array size exceeds VM limit
        at java.util.Arrays.copyOf(Arrays.java:2271)
        at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
        at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
        at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
        at java.io.ObjectOutputStream$BlockDataOutputStream.drain(ObjectOutputStream.java:1876)
        at java.io.ObjectOutputStream$BlockDataOutputStream.setBlockDataMode(ObjectOutputStream.java:1785)
        at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1188)
        at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:347)
        at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:42)
        at org.apache.spark.serializer.SerializationStream.writeAll(Serializer.scala:110)
        at org.apache.spark.storage.BlockManager.dataSerializeStream(BlockManager.scala:1047)
        at org.apache.spark.storage.BlockManager.dataSerialize(BlockManager.scala:1056)
        at org.apache.spark.storage.TachyonStore.putIterator(TachyonStore.scala:60)
        at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:743)
        at org.apache.spark.storage.BlockManager.putIterator(BlockManager.scala:594)
        at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:145)
        at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:70)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:227)
        at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
        at org.apache.spark.scheduler.Task.run(Task.scala:54)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

It is hard to tell from this output which stage fails, but the final step of
the job saves the output as a text file. Each individual record (key, value, or
key and value together) is relatively small, but the number of records in the
collection is large. I seem to have run into a bottleneck that I can't get
past. Any pointers in the right direction would be helpful!

Thanks,
Ami
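
A possible reading of the trace above, offered as a sketch and not a confirmed diagnosis: the failure is raised in ByteArrayOutputStream.grow, reached from BlockManager.dataSerialize via TachyonStore.putIterator while putting block rdd_19_5. ByteArrayOutputStream keeps its contents in a single byte[], and a Java array is indexed by int, so one serialized partition cannot grow past roughly 2 GiB regardless of heap size. Under that reading, the total serialized bytes per partition matter more than the size of any one record. The helper below is hypothetical (PartitionSizing and minPartitions are not Spark APIs) and the numbers in main are illustrative only; it estimates how many partitions would keep each serialized partition under a chosen budget.

```java
// Hypothetical helper, not part of Spark: estimates how many partitions are
// needed so that each partition's serialized form stays under a byte budget.
final class PartitionSizing {

    // A Java array is indexed by int, so no byte[] can be longer than this.
    // HotSpot's practical limit is slightly lower.
    static final long MAX_ARRAY_BYTES = Integer.MAX_VALUE;

    /**
     * @param recordCount         total number of records (assumed known or estimated)
     * @param avgSerializedBytes  assumed average serialized size of one record
     * @param budgetBytesPerBlock target ceiling for one serialized partition
     * @return the smallest partition count that keeps the average partition
     *         at or below the budget (at least 1)
     */
    static long minPartitions(long recordCount, long avgSerializedBytes, long budgetBytesPerBlock) {
        if (recordCount < 0 || avgSerializedBytes <= 0 || budgetBytesPerBlock <= 0) {
            throw new IllegalArgumentException("sizes must be positive and recordCount non-negative");
        }
        if (budgetBytesPerBlock > MAX_ARRAY_BYTES) {
            throw new IllegalArgumentException("budget exceeds the maximum Java array length");
        }
        // multiplyExact throws ArithmeticException instead of silently overflowing.
        long totalBytes = Math.multiplyExact(recordCount, avgSerializedBytes);
        // Ceiling division written to avoid overflow in the numerator.
        long partitions = totalBytes / budgetBytesPerBlock
                + (totalBytes % budgetBytesPerBlock != 0 ? 1 : 0);
        return Math.max(1L, partitions);
    }

    public static void main(String[] args) {
        // Illustrative numbers only; none of them come from the message above.
        long records = 2_000_000_000L;
        long avgBytes = 100L;
        long budget = 512L * 1024 * 1024; // 512 MiB per serialized partition
        System.out.println(minPartitions(records, avgBytes, budget));
    }
}
```

The estimate assumes records are spread evenly across partitions; skewed keys would need a larger count. The result could be used as the argument to repartition on the RDD before it is persisted, though whether that removes the error in this job is untested.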


