You can open the application UI (which runs on port 4040) and check how much memory is allocated to each executor on the Executors tab, and which memory settings are in effect on the Environment tab.

Are you deploying on YARN or in standalone mode? On YARN, try setting the shell variable SPARK_EXECUTOR_MEMORY=2G; in standalone mode, try setting SPARK_WORKER_MEMORY=2G.
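
To make the comparison concrete, here is a rough Python sketch of the check I would do by hand with the numbers from the UI. It assumes that in standalone mode an executor is only launched on a worker that has at least the requested executor memory free, which is what the "have sufficient memory" warning hints at. The helper names, worker names, and memory values are made up for illustration; they are not part of Spark.

```python
import re

# Multipliers to megabytes for JVM-style suffixes (k, m, g, t).
_UNITS_MB = {"k": 1 / 1024, "m": 1, "g": 1024, "t": 1024 * 1024}


def memory_to_mb(value):
    """Convert a memory string such as '1G' or '512m' to megabytes."""
    match = re.fullmatch(r"(\d+)([kmgt])b?", value.strip().lower())
    if match is None:
        raise ValueError(f"unrecognized memory string: {value!r}")
    amount, unit = match.groups()
    return int(amount) * _UNITS_MB[unit]


def workers_that_fit(executor_memory, free_memory_by_worker):
    """Return the workers whose free memory covers the requested executor memory."""
    needed = memory_to_mb(executor_memory)
    return [
        name
        for name, free in free_memory_by_worker.items()
        if memory_to_mb(free) >= needed
    ]


# Hypothetical free-memory figures, as you might read them off the master UI.
example_workers = {"worker-1": "512m", "worker-2": "2g"}
print(workers_that_fit("1G", example_workers))  # ['worker-2']
```

If the list comes back empty for the value you pass to --executor-memory, the job would sit waiting for resources rather than start, so that is the first thing I would rule out.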


TL;DR - a Spark SQL job fails with an OOM (Java heap space) error.  If given an "--executor-memory" value, it won't even start.  Even (!) if the value given IS THE SAME AS THE DEFAULT.

Without --executor-memory:

14/10/16 17:14:58 INFO TaskSetManager: Serialized task 1.0:64 as 14710 bytes in 1 ms
14/10/16 17:14:58 WARN TaskSetManager: Lost TID 26 (task 1.0:25)
14/10/16 17:14:58 WARN TaskSetManager: Loss was due to java.lang.OutOfMemoryError
java.lang.OutOfMemoryError: Java heap space
        at parquet.hadoop.ParquetFileReader$ConsecutiveChunkList.readAll(
        at parquet.hadoop.ParquetFileReader.readNextRowGroup(

USING --executor-memory (WITH ANY VALUE), even "1G" which is the default:

Parsed arguments:
  master                  spark://<redacted>:7077
  deployMode              null
  executorMemory          1G

System properties:
spark.executor.memory -> 1G
spark.eventLog.enabled -> true

14/10/16 17:14:23 INFO TaskSchedulerImpl: Adding task set 1.0 with 678 tasks
14/10/16 17:14:38 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory

Spark 1.0.0.  Is this a bug?

