spark-dev mailing list archives

From Tom Hubregtsen <>
Subject Re: Spilling when not expected
Date Fri, 13 Mar 2015 16:33:38 GMT
I use the spark-submit script and the config files in a conf directory. I
see the memory settings reflected in stdout, as well as in the web UI
(it prints all variables from spark-defaults.conf, and mentions that I have
540GB of free memory available when trying to store a broadcast variable or
RDD). I also ran "ps aux | grep java | grep th", which shows that I called
java with "-Xms1000g -Xmx1000g".

I also tested whether these numbers are realistic for the J9 JVM. Outside
of Spark, setting just the initial heap size (-Xms) gives an error, but if
I also set the maximum (-Xmx) alongside it, it appears to be accepted. In
IBM's J9 Health Center, I can also see it reserve the 900g and use up to
68g.
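As a side note, the 540GB figure mentioned above follows from Spark 1.x's default memory fractions. A minimal sketch of that arithmetic, assuming the documented defaults of that era (spark.storage.memoryFraction = 0.6, spark.shuffle.memoryFraction = 0.2) rather than anything read from this thread's actual config files:

```scala
// Sketch: how 900g of executor memory splits under the Spark 1.x
// defaults. The fraction values are assumed defaults, not taken from
// this thread's configuration; JVM overhead is ignored here.
object MemoryFractions {
  val executorMemoryGb = 900.0
  val storageFraction  = 0.6  // default spark.storage.memoryFraction
  val shuffleFraction  = 0.2  // default spark.shuffle.memoryFraction

  // Rough upper bounds Spark would use for cached blocks and for
  // shuffle buffers, respectively.
  val storageGb = executorMemoryGb * storageFraction  // 540 GB
  val shuffleGb = executorMemoryGb * shuffleFraction  // 180 GB

  def main(args: Array[String]): Unit =
    println(f"storage: $storageGb%.0f GB, shuffle: $shuffleGb%.0f GB")
}
```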



On 13 March 2015 at 02:05, Reynold Xin <> wrote:

> How did you run the Spark command? Maybe the memory setting didn't
> actually apply? How much memory does the web UI say is available?
> BTW - I don't think any JVM can actually handle a 700G heap ... (maybe Zing).
> On Thu, Mar 12, 2015 at 4:09 PM, Tom Hubregtsen <>
> wrote:
>> Hi all,
>> I'm running the teraSort benchmark with a relatively small input set: 5GB.
>> During profiling, I can see I am using a total of 68GB. I've got a
>> terabyte
>> of memory in my system, and set
>> spark.executor.memory 900g
>> spark.driver.memory 900g
>> I use the default for
>> spark.shuffle.memoryFraction
>> I believe that I now have 0.2*900=180GB for shuffle and 0.6*900=540GB for
>> storage.
>> I noticed a lot of variation in runtime (under the same load), and tracked
>> this down to this function in
>> core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala
>>   private def spillToPartitionFiles(collection:
>> SizeTrackingPairCollection[(Int, K), C]): Unit = {
>>     spillToPartitionFiles(collection.iterator)
>>   }
>> In a slow run it loops through this function 12000 times; in a fast run,
>> only 700 times, even though the settings in both runs are the same and
>> there are no other users on the system. When I look at the function that
>> calls it (insertAll, also in ExternalSorter), I see that
>> spillToPartitionFiles is called only 700 times in both fast and slow
>> runs, meaning that the function calls itself recursively very often.
>> Because of the function name, I assume the system is spilling to disk.
>> As I have sufficient memory, I assume that I forgot to set a certain
>> memory setting. Does anybody have an idea which other setting I need to
>> change so that data is not spilled in this scenario?
>> Thanks,
>> Tom
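A note on why spills can happen despite abundant heap: in Spark 1.x, the shuffle pool (spark.shuffle.memoryFraction) is shared among concurrently running tasks, and an in-memory collection spills whenever it cannot grow its own per-task threshold. A simplified, hedged sketch of that decision, loosely modeled on the Spillable trait; the names, numbers, and pool model here are illustrative assumptions, not the actual Spark source:

```scala
// Hedged sketch of a per-task spill decision (simplified). Each task
// starts with a small threshold and must claim more memory from a
// shared shuffle pool; when the pool is exhausted, the collection
// spills even though the JVM heap may have plenty of room.
object SpillSketch {
  // Illustrative initial per-task threshold (5 MB).
  var memoryThreshold = 5L * 1024 * 1024

  // Pretend share of the shuffle pool still available to this task.
  var poolRemaining = 8L * 1024 * 1024

  private def tryAcquire(bytes: Long): Long = {
    val granted = math.min(bytes, poolRemaining)
    poolRemaining -= granted
    granted
  }

  /** Returns true if the in-memory collection should spill to disk. */
  def maybeSpill(currentSizeBytes: Long): Boolean = {
    if (currentSizeBytes <= memoryThreshold) return false
    // Try to raise the threshold to twice the current size by
    // claiming the difference from the shared pool.
    val granted = tryAcquire(2 * currentSizeBytes - memoryThreshold)
    memoryThreshold += granted
    // Spill if the pool could not raise the threshold far enough.
    currentSizeBytes > memoryThreshold
  }
}
```

Under this model, repeated spills (like the 12000 recursive spillToPartitionFiles calls above) point at pool contention between tasks rather than at the total heap size.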
