From Wei Tan <>
Subject Re: long GC pause during file.cache()
Date Mon, 16 Jun 2014 14:46:14 GMT
Thank you all for the advice, including (1) using CMS GC, (2) using multiple 
worker instances, and (3) using Tachyon.

I will try (1) and (2) first and report back what I find.

I will also try JDK 7 with G1 GC.

Best regards,

Wei Tan, PhD
Research Staff Member
IBM T. J. Watson Research Center

From:   Aaron Davidson <>
Date:   06/15/2014 09:06 PM
Subject:        Re: long GC pause during file.cache()

Note also that Java does not work well with very large JVMs due to this 
exact issue. There are two commonly used workarounds:

1) Spawn multiple (smaller) executors on the same machine. This can be 
done by creating multiple Workers (via SPARK_WORKER_INSTANCES in 
standalone mode[1]).
2) Use Tachyon for off-heap caching of RDDs, allowing Spark executors to 
be smaller and avoid GC pauses
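
A minimal sketch of workaround 1, assuming a single 192G node like the one in the original question. SPARK_WORKER_INSTANCES, SPARK_WORKER_MEMORY and SPARK_WORKER_CORES are standalone-mode settings read from the environment; the specific counts and sizes (6 workers, 28g each, 4 cores each, 24 cores total) are illustrative assumptions, not tuned recommendations.

```shell
# Hypothetical environment fragment for a single 192G standalone node.
# The counts and sizes below are illustrative assumptions, not tuned values.
export SPARK_WORKER_INSTANCES=6   # six Worker processes instead of one
export SPARK_WORKER_MEMORY=28g    # memory each Worker may hand out to executors
export SPARK_WORKER_CORES=4       # cores per Worker (assumes 24 cores in total)

# Sanity check: total memory offered across all Workers, in GB.
# ${SPARK_WORKER_MEMORY%g} strips the trailing "g" so the value can be multiplied.
total_gb=$(( SPARK_WORKER_INSTANCES * ${SPARK_WORKER_MEMORY%g} ))
echo "total worker memory: ${total_gb}g"
```

The idea is that each executor JVM then manages a heap of tens of gigabytes rather than one 180g heap, which should keep individual collections shorter. Leaving some memory unassigned for the OS and page cache is a design choice, not a requirement.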

[1] See standalone documentation here:

On Sun, Jun 15, 2014 at 3:50 PM, Nan Zhu <> wrote:
Yes, I think in the, it is listed in the comments 
(didn’t check….) 


Nan Zhu

On Sunday, June 15, 2014 at 5:21 PM, Surendranauth Hiraman wrote:
Is SPARK_DAEMON_JAVA_OPTS valid in 1.0.0?

On Sun, Jun 15, 2014 at 4:59 PM, Nan Zhu <> wrote:
SPARK_JAVA_OPTS is deprecated in 1.0, though it works fine if you 
don’t mind the WARNING in the logs

you can set spark.executor.extraJavaOptions in your SparkConf object
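
As a rough sketch of the non-deprecated route: the property is documented as spark.executor.extraJavaOptions, and besides SparkConf it can also be given as a line in conf/spark-defaults.conf. The snippet below only builds and prints such a line; the flag selection (CMS plus GC logging) is an assumption for illustration, not a recommended set.

```shell
# Hypothetical GC flags: CMS, plus GC logging so pauses show up in executor logs.
GC_OPTS="-XX:+UseConcMarkSweepGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"

# A line in the format used by conf/spark-defaults.conf: key, whitespace, value.
CONF_LINE="spark.executor.extraJavaOptions  ${GC_OPTS}"

# Print the line so it can be reviewed before being added to the config file.
echo "$CONF_LINE"
```

Note that heap size is not meant to go through this property; executor memory is set separately with spark.executor.memory.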


Nan Zhu

On Sunday, June 15, 2014 at 12:13 PM, Hao Wang wrote:
Hi, Wei

You could try setting the JVM options as follows to prevent or 
mitigate GC pauses:

export SPARK_JAVA_OPTS="-XX:-UseGCOverheadLimit -XX:+UseConcMarkSweepGC 
-Xmx2g -XX:MaxPermSize=256m"

There are more options you could add, please just Google :) 

On Sun, Jun 15, 2014 at 10:24 AM, Wei Tan <> wrote:

  I have a single node (192G RAM) stand-alone spark, with memory 
configuration like this in 


 In spark-shell I have a program like this: 

val file = sc.textFile("/localpath") //file size is 40G 

val output = file.map(line => extract something from line) 

output.saveAsTextFile (...) 

When I run this program again and again, or keep trying file.unpersist() 
--> file.cache() --> output.saveAsTextFile(), the run time varies a lot, 
from 1 min to 3 min to 50+ min. Whenever the run time is more than 1 min, 
the stage monitoring GUI shows long GC pauses (some are 10+ min). Of course, 
when the run time is "normal", say ~1 min, no significant GC is observed. 
The behavior seems somewhat random. 

Is there any JVM tuning I should do to prevent these long GC pauses from 
happening?

I used java-1.6.0-openjdk.x86_64, and my spark-shell process is something 
like this: 

root     10994  1.7  0.6 196378000 1361496 pts/51 Sl+ 22:06   0:12 
/usr/lib/jvm/java-1.6.0-openjdk.x86_64/bin/java -cp 

-XX:MaxPermSize=128m -Djava.library.path= -Xms180g -Xmx180g 
org.apache.spark.deploy.SparkSubmit spark-shell --class 

Best regards, 

Wei Tan, PhD 
Research Staff Member 
IBM T. J. Watson Research Center
