spark-user mailing list archives

From Sean Owen <so...@cloudera.com>
Subject Re: Running "beyond memory limits" in ConnectedComponents
Date Wed, 14 Jan 2015 21:44:22 GMT
That's not quite what that error means. Spark is not out of memory. It
means that Spark is using more memory than it asked YARN for. That in
turn is because the default cushion between the JVM heap size and the
container size that YARN allows is too small. See
spark.yarn.executor.memoryOverhead in
http://spark.apache.org/docs/latest/running-on-yarn.html
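
As a rough sanity check on that reading, the numbers in the container dump
quoted below can be worked through. This is only a sketch: it assumes 4 KiB
memory pages (the usual Linux x86-64 default, not something the log states),
and the variable names are purely illustrative.

```shell
#!/bin/sh
# Figures copied from the process-tree dump in the quoted message.
heap_mb=6144        # -Xmx6144m on the java command line
rss_pages=1707599   # RSSMEM_USAGE(PAGES) reported for the java process
page_bytes=4096     # ASSUMPTION: 4 KiB pages; the log does not say

# Resident memory of the JVM in MB (integer division, rounds down).
rss_mb=$(( rss_pages * page_bytes / 1024 / 1024 ))

# Whatever is resident beyond the heap cap lives outside the heap
# (thread stacks, direct buffers, JVM internals and so on).
off_heap_mb=$(( rss_mb - heap_mb ))

echo "resident: ${rss_mb} MB, heap cap: ${heap_mb} MB, outside heap: ${off_heap_mb} MB"
```

If the page-size assumption holds, the JVM is resident at about 6670 MB
against a 6144 MB heap cap, so roughly 526 MB sits outside the heap, which is
more than the roughly 0.5 GB of cushion under the 6.5 GB container limit. Note
also that the command line in the dump is
org.apache.spark.deploy.yarn.ApplicationMaster, which hosts the driver in
yarn-cluster mode, so for this particular container the matching setting
would be spark.yarn.driver.memoryOverhead. It can be passed on spark-submit
as, say, --conf spark.yarn.driver.memoryOverhead=1024, where 1024 (MB) is
only an illustrative value, not a tested recommendation.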

On Wed, Jan 14, 2015 at 9:18 PM, nitinkak001 <nitinkak001@gmail.com> wrote:
> I am trying to run the connected components algorithm in Spark. The graph has
> roughly 28M edges and 3.2M vertices. Here is the code I am using:
>
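>     import org.apache.spark.{SparkConf, SparkContext}
>     import org.apache.spark.graphx.GraphLoader
>     import org.apache.spark.storage.StorageLevel
>
>     val inputFile =
>       "/user/hive/warehouse/spark_poc.db/window_compare_output_text/000000_0"
>     val conf = new SparkConf().setAppName("ConnectedComponentsTest")
>     val sc = new SparkContext(conf)
>     // Canonical edge orientation, edge partition count 7, and storage
>     // levels that are allowed to spill to disk.
>     val graph = GraphLoader.edgeListFile(sc, inputFile, true, 7,
>       StorageLevel.MEMORY_AND_DISK, StorageLevel.MEMORY_AND_DISK)
>     graph.cache()
>     val cc = graph.connectedComponents()
>     // Note: this saves the input edges, not the component labels held in cc.
>     graph.edges.saveAsTextFile("/user/kakn/output")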
>
> and here is the command:
>
> spark-submit --class ConnectedComponentsTest --master yarn-cluster \
>   --num-executors 7 --driver-memory 6g --executor-memory 8g --executor-cores 1 \
>   target/scala-2.10/connectedcomponentstest_2.10-1.0.jar
>
> It runs for about an hour and then fails with the error below. *Isn't Spark
> supposed to spill to disk if the RDDs don't fit into memory?*
>
> Application application_1418082773407_8587 failed 2 times due to AM
> Container for appattempt_1418082773407_8587_000002 exited with exitCode:
> -104 due to: Container
> [pid=19790,containerID=container_1418082773407_8587_02_000001] is running
> beyond physical memory limits. Current usage: 6.5 GB of 6.5 GB physical
> memory used; 8.9 GB of 13.6 GB virtual memory used. Killing container.
> Dump of the process-tree for container_1418082773407_8587_02_000001 :
> |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS)
> SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
> |- 19790 19788 19790 19790 (bash) 0 0 110809088 336 /bin/bash -c
> /usr/java/jdk1.7.0_67-cloudera/bin/java -server -Xmx6144m
> -Djava.io.tmpdir=/mnt/DATA1/yarn/nm/usercache/kakn/appcache/application_1418082773407_8587/container_1418082773407_8587_02_000001/tmp
> '-Dspark.executor.memory=8g' '-Dspark.eventLog.enabled=true'
> '-Dspark.yarn.secondary.jars=' '-Dspark.app.name=ConnectedComponentsTest'
> '-Dspark.eventLog.dir=hdfs://<server-name-replaced>:8020/user/spark/applicationHistory'
> '-Dspark.master=yarn-cluster' org.apache.spark.deploy.yarn.ApplicationMaster
> --class 'ConnectedComponentsTest' --jar
> 'file:/home/kakn01/Spark/SparkSource/target/scala-2.10/connectedcomponentstest_2.10-1.0.jar'
> --executor-memory 8192 --executor-cores 1 --num-executors 7 1>
> /var/log/hadoop-yarn/container/application_1418082773407_8587/container_1418082773407_8587_02_000001/stdout
> 2>
> /var/log/hadoop-yarn/container/application_1418082773407_8587/container_1418082773407_8587_02_000001/stderr
> |- 19794 19790 19790 19790 (java) 205066 9152 9477726208 1707599
> /usr/java/jdk1.7.0_67-cloudera/bin/java -server -Xmx6144m
> -Djava.io.tmpdir=/mnt/DATA1/yarn/nm/usercache/kakn/appcache/application_1418082773407_8587/container_1418082773407_8587_02_000001/tmp
> -Dspark.executor.memory=8g -Dspark.eventLog.enabled=true
> -Dspark.yarn.secondary.jars= -Dspark.app.name=ConnectedComponentsTest
> -Dspark.eventLog.dir=hdfs://<server-name-replaced>:8020/user/spark/applicationHistory
> -Dspark.master=yarn-cluster org.apache.spark.deploy.yarn.ApplicationMaster
> --class ConnectedComponentsTest --jar
> file:/home/kakn01/Spark/SparkSource/target/scala-2.10/connectedcomponentstest_2.10-1.0.jar
> --executor-memory 8192 --executor-cores 1 --num-executors 7
> Container killed on request. Exit code is 143
> Container exited with a non-zero exit code 143
> .Failing this attempt.. Failing the application.
>
>
>
> --
> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Running-beyond-memory-limits-in-ConnectedComponents-tp21139.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
> For additional commands, e-mail: user-help@spark.apache.org
>
