spark-user mailing list archives

From Aaron Davidson <>
Subject Re: "Spilling in-memory..." messages in log even with MEMORY_ONLY
Date Sun, 27 Jul 2014 17:54:41 GMT
I see. As far as I can tell, there should not be a significant algorithmic
difference between those two cases, but there is a good bit of
"local-mode-only" logic in Spark.

One typical problem we see on large-heap, many-core JVMs, though, is much
more time spent in garbage collection. I'm not sure how oprofile gathers
its statistics, but it's possible the stop-the-world pauses just appear as
pausing inside regular methods. You could see if this is happening by
adding "-XX:+PrintGCDetails" to spark.executor.extraJavaOptions (in
spark-defaults.conf) and --driver-java-options (as a command-line
argument), and then examining the stdout logs.
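
As a minimal sketch of that setup: the flag and the two option names are the ones mentioned above, while the jar path, input file, and iteration count are placeholders assumed for illustration and do not come from this thread. Note that in local mode the tasks run inside the driver JVM, so the driver-side flag is the one that should matter for local[48].

```shell
# Executor side: add this line to conf/spark-defaults.conf
# (key and value separated by whitespace):
#
#   spark.executor.extraJavaOptions  -XX:+PrintGCDetails

# Driver side: pass the same JVM flag on the command line.
# EXAMPLES_JAR and INPUT_FILE are hypothetical placeholders; set them
# to your own examples jar and edge-list file before running.
./bin/spark-submit \
  --master "local[48]" \
  --driver-java-options "-XX:+PrintGCDetails" \
  --class org.apache.spark.examples.SparkPageRank \
  "$EXAMPLES_JAR" "$INPUT_FILE" 10
```

The GC details should then appear in the stdout of each JVM, where long or frequent collections can be compared between the two runs.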

On Sun, Jul 27, 2014 at 10:29 AM, lokesh.gidra <> wrote:

> I am comparing the total time taken to finish the job. To be precise, I am
> comparing, on a 48-core machine, the performance of local[48] vs. standalone
> mode with 8 nodes of 6 cores each (totalling 48 cores) on localhost. In this
> comparison, standalone mode outperforms local[48] substantially. When I did
> some troubleshooting using oprofile, I found that local[48] was spending
> much more time in writeObject0 than standalone mode.
> I am running the PageRank example provided in the package.
