spark-user mailing list archives

From Yong Zhang <java8...@hotmail.com>
Subject Re: Executor Lost error
Date Tue, 04 Oct 2016 13:39:53 GMT
You should check your executor log to identify the reason. My guess is that the executor
died due to OOM.


If that is the reason, then you need to tune your executor memory settings or, more importantly,
your partition count, to make sure each executor has enough memory to handle the size of its
partition of the data.
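As a rough back-of-envelope sketch of the partition-count tuning mentioned above (the input size is from the thread, but the 128 MB target partition size is an illustrative assumption, not a recommendation from this list):

```python
# Estimate how many partitions keep each partition near a target size.
# 1 TB input (from the thread); 128 MB per partition is an assumed target.
input_bytes = 1 * 1024**4               # ~1 TB
target_partition_bytes = 128 * 1024**2  # ~128 MB (assumption)

# Ceiling division: partitions needed so no partition exceeds the target.
partitions = -(-input_bytes // target_partition_bytes)
print(partitions)  # 8192
```

A number like this would then be passed to repartition() or set via spark.default.parallelism, sized against the memory actually available per executor core.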


Yong


________________________________
From: Punit Naik <naik.punit44@gmail.com>
Sent: Monday, October 3, 2016 8:07 PM
To: user
Subject: Executor Lost error

Hi All

I am trying to run a program on a large dataset (~ 1 TB). I have already tested the code on
smaller data and it works fine. But what I noticed is that the job fails when the input is
large. It was giving me errors about Akka actor disassociation, which I fixed by increasing
the timeouts.
But now I am getting errors like "executor lost" and "executor lost failure" which I can't
seem to figure out. These are my current set of configs:

--conf "spark.network.timeout=30000"
--conf "spark.core.connection.ack.wait.timeout=30000"
--conf "spark.akka.timeout=30000"
--conf "spark.akka.askTimeout=30000"
--conf "spark.akka.frameSize=1000"
--conf "spark.storage.blockManagerSlaveTimeoutMs=600000"
--conf "spark.network.timeout=600"
--conf "spark.shuffle.memoryFraction=0.8"
--conf "spark.driver.maxResultSize=16g"
--conf "spark.driver.cores=10"
--conf "spark.driver.memory=10g"

Can anyone suggest any other configs to work around these "executor lost" and "executor lost
failure" errors?

--
Thank You

Regards

Punit Naik
