To be clear on what your configuration will do:

- SPARK_DAEMON_MEMORY=8g gives your standalone master and worker daemons a large heap. These daemons only schedule work; they do not affect the amount of memory actually available to executors or to your driver, so you probably don't need to set this.
- SPARK_WORKER_MEMORY=8g allows each worker to provide up to 8g worth of executors. By itself this does not give executors more memory; it only raises the cap on what they can request. This setting is necessary.

- *_JAVA_OPTS should not be used to set memory parameters, as they may or may not override their *_MEMORY counterparts. (A corrected spark-env.sh sketch follows this list.)
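
Putting those three points together, a minimal spark-env.sh sketch (the 8g figure is just carried over from your mail; adjust it to your machines):

# spark-env.sh -- sketch based on the points above
export SPARK_WORKER_MEMORY=8g   # cap on total executor memory per worker (required)
# SPARK_DAEMON_MEMORY and the -Xms8g/-Xmx8g *_JAVA_OPTS lines are deliberately dropped:
# the daemons don't need that much heap, and raw JVM flags can fight the *_MEMORY settings.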

The two things you are not configuring are the amount of memory for your driver (for a 0.8.1 spark-shell, you must use SPARK_MEM) and the amount of memory given to each executor (spark.executor.memory). By default, Spark executors are only 512MB in size, so you will probably want to increase this, up to the value of SPARK_WORKER_MEMORY. Setting it equal to SPARK_WORKER_MEMORY gives you one executor per worker that uses all available memory, which is probably what you want for testing purposes (it is less ideal for sharing a cluster).
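
For example, one way to set both for a 0.8.1 spark-shell session (a sketch; the master URL is a placeholder, and note that -Dspark.executor.memory is a Spark system property rather than a JVM -Xmx flag, so the *_JAVA_OPTS caveat above does not apply to it):

export SPARK_MEM=8g                                   # sizes the 0.8.1 spark-shell driver JVM
export SPARK_JAVA_OPTS="-Dspark.executor.memory=8g"   # per-executor heap, up to SPARK_WORKER_MEMORY
MASTER=spark://<master-host>:7077 ./spark-shell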

In case the distinction between workers/masters (collectively "daemons"), executors, and drivers is not clear to you, please check out the corresponding documentation on Spark clusters: https://spark.incubator.apache.org/docs/0.8.1/cluster-overview.html


On Mon, Mar 24, 2014 at 12:24 AM, Sai Prasanna <ansaiprasanna@gmail.com> wrote:

Hi All!! I am getting the following error in the interactive spark-shell [0.8.1]:


org.apache.spark.SparkException: Job aborted: Task 0.0:0 failed more than 0 times; aborting job java.lang.OutOfMemoryError: GC overhead limit exceeded


But I had set the following in spark-env.sh and hadoop-env.sh:

export SPARK_DAEMON_MEMORY=8g
export SPARK_WORKER_MEMORY=8g
export SPARK_DAEMON_JAVA_OPTS="-Xms8g -Xmx8g"
export SPARK_JAVA_OPTS="-Xms8g -Xmx8g"

export HADOOP_HEAPSIZE=4000

Any suggestions??

--
Sai Prasanna. AN
II M.Tech (CS), SSSIHL