spark-user mailing list archives

From Koert Kuipers <ko...@tresata.com>
Subject spark sql jobs heap memory
Date Wed, 23 Nov 2016 18:53:11 GMT
We are testing Dataset/DataFrame jobs instead of RDD jobs. One thing we
keep running into is containers getting killed by YARN for exceeding their
memory limits. I realize this has to do with off-heap memory usage, and the
usual suggestion is to increase spark.yarn.executor.memoryOverhead.

At times our memoryOverhead is now as large as the executor memory itself
(say 4G and 4G).
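
For concreteness, here is a minimal sketch of the settings in question
(values match the illustrative 4G/4G above, not a recommendation; normally
we pass these as --conf flags to spark-submit rather than hard-coding them):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("dataset-job")
  // on-heap memory requested per executor
  .config("spark.executor.memory", "4g")
  // extra off-heap headroom, in MB; YARN kills the container when the
  // process exceeds executor memory + overhead
  .config("spark.yarn.executor.memoryOverhead", "4096")
  .getOrCreate()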

Why is Dataset/DataFrame using so much off-heap memory?

We haven't changed spark.memory.offHeap.enabled, which defaults to false.
Should we enable it to get a better handle on this?
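
If it would help, a hedged sketch of what enabling it would look like (the
2g size is only an assumption for illustration; as I understand it,
spark.memory.offHeap.size must be set to a positive value when the flag is
enabled, and that pool still has to fit under the YARN container limit):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("dataset-job-offheap")
  // let Spark allocate execution memory off-heap explicitly
  .config("spark.memory.offHeap.enabled", "true")
  // explicit cap on the off-heap pool (illustrative value)
  .config("spark.memory.offHeap.size", "2g")
  .getOrCreate()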
