Welcome to Spark. What's more fun is that setting controls memory on the executors but if you want to set memory limit on the driver you need to configure it as a parameter of the spark-submit script. You also set num-executors and executor-cores on the spark submit call.

See both the Spark tuning guide and the Spark configuration page for more discussion of stuff like this.

W.r.t. The spark memory option, my understanding is that parameter has been deprecated (the SPARK_EXE_MEM) and the documentation is probably stale. Good starting point for cleanup would probably be to update that :-).
On Thu, Jan 1, 2015 at 1:45 AM Kevin Burton <burton@spinn3r.com> wrote:
wow. Just figured it out:

        conf.set( "spark.executor.memory", "2g");

I have to set it in the Job… that’s really counter intuitive.  Especially because the documentation in spark-env.sh says the exact opposite.

What’s the resolution here.  This seems like a mess. I’d propose a solution to clean it up but I don’t know where to begin.

On Wed, Dec 31, 2014 at 10:35 PM, Kevin Burton <burton@spinn3r.com> wrote:
This is really weird and I’m surprised no one has found this issue yet.

I’ve spent about an hour or more trying to debug this :-(

My spark install is ignoring ALL my memory settings.  And of course my job is running out of memory.

The default is 512MB so pretty darn small.  

The worker and master start up and both use 512M

This alone is very weird and poor documentation IMO because:

 "SPARK_WORKER_MEMORY, to set how much total memory workers have to give executors (e.g. 1000m, 2g)”

… so if it’s giving it to executors, AKA the memory executors run with, then it should be SPARK_EXECUTOR_MEMORY…

… and the worker actually uses SPARK_DAEMON memory.  

but actually I’m right.  It IS SPARK_EXECUTOR_MEMORY… according to bin/spark-class 

… but, that’s not actually being used :-( 

that setting is just flat out begin ignored and it’s just using 512MB.  So all my jobs fail.

… and I write an ‘echo’ so I could trace the spark-class script to see what the daemons are actually being run with and spark-class wasn’t being called with and nothing is logged for the coarse grained executor.  I guess it’s just inheriting the JVM opts from it’s parent and Java is launching the process directly?

This is a nightmare :( 


Founder/CEO Spinn3r.com
Location: San Francisco, CA
… or check out my Google+ profile


Founder/CEO Spinn3r.com
Location: San Francisco, CA
… or check out my Google+ profile