spark-dev mailing list archives

From Madhu <>
Subject Re: Detecting configuration problems
Date Tue, 08 Sep 2015 12:53:57 GMT
Thanks Akhil!

I suspect the root cause of the shuffle OOM I was seeing (and probably many
that users might see) is that individual partitions on the reduce side do not
fit in memory. As a guideline, I was thinking of something like "be sure
that your largest partitions occupy no more than 1% of executor memory" or
something to that effect. I can add that documentation to the tuning page if
someone can suggest the best wording and numbers. I can also add a
simple Spark shell example that estimates the largest partition size, to help
determine executor memory and number of partitions.
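As a rough sketch of what such an estimate could look like (the object and
method names here are hypothetical, and the 1% budget is just the guideline
floated above, not anything Spark enforces):

```scala
// Hypothetical sizing helper: given a sampled average record size and a
// record count, estimate partition size, and derive how many partitions
// keep the largest one under a fraction of executor memory.
object PartitionSizing {
  // Estimated bytes for one partition from sampled record size and count.
  def estimatedPartitionBytes(avgRecordBytes: Long, recordsPerPartition: Long): Long =
    avgRecordBytes * recordsPerPartition

  // Minimum number of partitions so that an evenly distributed shuffle
  // keeps each partition below maxFraction of executor memory.
  def minPartitions(totalShuffleBytes: Long,
                    executorMemoryBytes: Long,
                    maxFraction: Double = 0.01): Long = {
    val budgetPerPartition = (executorMemoryBytes * maxFraction).toLong
    math.max(1L, math.ceil(totalShuffleBytes.toDouble / budgetPerPartition).toLong)
  }
}
```

In a real Spark shell session the inputs would come from sampling the RDD
(e.g. average serialized record size over a small sample), and skewed keys
would of course make the "evenly distributed" assumption optimistic.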

One more question: I'm trying to get my head around the shuffle code. I see
ShuffleManager, but that seems to be on the reduce side. Where is the code
driving the map-side writes and reduce-side reads? I think it is possible to
add up the reduce-side volume for a key (they are byte reads at some point)
and raise an alarm if it's getting too high. Even a warning on the console
would be better than a catastrophic OOM.
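The proposed check might look something like the sketch below: a running
byte counter per reduce partition that warns once past a threshold. This is
a pure-Scala illustration of the idea, not actual Spark shuffle internals;
the class name and hook point are assumptions.

```scala
// Hypothetical monitor: accumulate the bytes a reduce task has fetched for
// its partition and emit a warning the first time the running total crosses
// a threshold, instead of waiting for an OOM.
class PartitionByteMonitor(thresholdBytes: Long, warn: String => Unit) {
  private var bytesRead: Long = 0L
  private var warned = false

  // Would be called for each fetched shuffle block; returns the running total.
  def record(blockBytes: Long): Long = {
    bytesRead += blockBytes
    if (!warned && bytesRead > thresholdBytes) {
      warned = true
      warn(s"Reduce partition has read $bytesRead bytes, " +
           s"exceeding threshold $thresholdBytes; partition may not fit in memory")
    }
    bytesRead
  }
}
```

The one-shot `warned` flag keeps the console from being flooded as reads
continue past the threshold.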

