[ https://issues.apache.org/jira/browse/SPARK-22184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16188282#comment-16188282 ]

Apache Spark commented on SPARK-22184:
--------------------------------------

User 'szhem' has created a pull request for this issue:
https://github.com/apache/spark/pull/19410

> GraphX fails in case of insufficient memory and checkpoints enabled
> -------------------------------------------------------------------
>
>                 Key: SPARK-22184
>                 URL: https://issues.apache.org/jira/browse/SPARK-22184
>             Project: Spark
>          Issue Type: Bug
>          Components: GraphX
>    Affects Versions: 2.2.0
>        Environment: spark 2.2.0
>                     scala 2.11
>            Reporter: Sergey Zhemzhitsky
>
> GraphX fails with a FileNotFoundException when checkpoints are enabled and memory is insufficient.
> Here is the stack trace:
> {code}
> Job aborted due to stage failure: Task creation failed: java.io.FileNotFoundException: File file:/tmp/spark-90119695-a126-47b5-b047-d656fee10c17/9b16e2a9-6c80-45eb-8736-bbb6eb840146/rdd-28/part-00000 does not exist
> java.io.FileNotFoundException: File file:/tmp/spark-90119695-a126-47b5-b047-d656fee10c17/9b16e2a9-6c80-45eb-8736-bbb6eb840146/rdd-28/part-00000 does not exist
> 	at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:539)
> 	at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:752)
> 	at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:529)
> 	at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:409)
> 	at org.apache.spark.rdd.ReliableCheckpointRDD.getPreferredLocations(ReliableCheckpointRDD.scala:89)
> 	at org.apache.spark.rdd.RDD$$anonfun$preferredLocations$1.apply(RDD.scala:274)
> 	at org.apache.spark.rdd.RDD$$anonfun$preferredLocations$1.apply(RDD.scala:274)
> 	at scala.Option.map(Option.scala:146)
> 	at org.apache.spark.rdd.RDD.preferredLocations(RDD.scala:274)
> 	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$getPreferredLocsInternal(DAGScheduler.scala:1697)
> 	...
> {code}
> As GraphX caches RDDs intensively, the issue is reproducible only when previously cached and checkpointed vertex and edge RDDs have been evicted from memory and must be re-read from disk.
> For testing purposes, the following parameters may be set to emulate a low-memory environment:
> {code}
> val sparkConf = new SparkConf()
>   .set("spark.graphx.pregel.checkpointInterval", "2")
>   // set testing memory to evict cached RDDs from it and force
>   // reading checkpointed RDDs from disk
>   .set("spark.testing.reservedMemory", "128")
>   .set("spark.testing.memory", "256")
> {code}
> This issue also includes SPARK-22150 and cannot be fixed until SPARK-22150 is fixed as well.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org
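For context, a minimal reproduction sketch built around the low-memory settings quoted in the issue. The local master, checkpoint directory, and the ring-graph Pregel job are hypothetical illustrations (they are not from the issue); the intent is only that Pregel runs enough supersteps to checkpoint vertex and edge RDDs while memory pressure evicts the cached copies:

{code}
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, Pregel}

object Spark22184Repro {
  def main(args: Array[String]): Unit = {
    // Settings from the issue, plus a short checkpoint interval so
    // Pregel checkpoints intermediate RDDs frequently.
    val conf = new SparkConf()
      .setMaster("local[2]")                    // hypothetical local run
      .setAppName("SPARK-22184-repro")
      .set("spark.graphx.pregel.checkpointInterval", "2")
      .set("spark.testing.reservedMemory", "128")
      .set("spark.testing.memory", "256")
    val sc = new SparkContext(conf)
    sc.setCheckpointDir("/tmp/spark-22184-ckpt") // hypothetical directory

    // A small ring graph; the exact graph does not matter, only that the
    // computation iterates long enough to trigger several checkpoints.
    val n = 1000L
    val edges = sc.parallelize((0L until n).map(i => Edge(i, (i + 1) % n, 1)))
    val graph = Graph.fromEdges(edges, defaultValue = 0L)

    // Propagate the maximum vertex id around the ring: many supersteps,
    // so cached/checkpointed RDDs get evicted and re-read from disk.
    val result = Pregel(graph.mapVertices((id, _) => id), initialMsg = -1L)(
      vprog = (_, attr, msg) => math.max(attr, msg),
      sendMsg = t =>
        if (t.srcAttr > t.dstAttr) Iterator((t.dstId, t.srcAttr))
        else Iterator.empty,
      mergeMsg = math.max
    )
    println(result.vertices.count())
    sc.stop()
  }
}
{code}

Under the quoted memory limits, task scheduling for the checkpointed RDDs can hit the FileNotFoundException from the stack trace above once the in-memory copies are gone.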