spark-user mailing list archives

From yncxcw <ynjassionc...@gmail.com>
Subject Re: Data loss in spark job
Date Wed, 28 Feb 2018 06:20:31 GMT
Hi,

Please check whether your OS allows memory overcommit. I suspect this is
happening because your OS forbids memory overcommitment: when an
overcommit is detected, the OS kills a process, and the Spark executor is
the one chosen to be killed. That is why you receive a SIGTERM; the
executor dies on that signal and its data is lost.

Please check /proc/sys/vm/overcommit_memory and set it accordingly:

/proc/sys/vm/overcommit_memory
This setting accepts three values:

0: The Linux kernel may overcommit memory (this is the default); a
heuristic is used to decide whether enough memory is available.
1: The Linux kernel always overcommits memory and never checks whether
enough memory is available. This increases the risk of out-of-memory
situations, but can help some memory-intensive workloads.
2: The Linux kernel does not overcommit memory; commits are limited to
swap plus the fraction of physical RAM defined by overcommit_ratio.
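
For reference, here is a minimal Scala sketch that reads the current value
from procfs (Linux only; the path is the standard sysctl location, and the
object name is just for illustration):

    import scala.io.Source

    object OvercommitCheck {
      def main(args: Array[String]): Unit = {
        // Read the current overcommit policy (0, 1, or 2 as described above).
        val src = Source.fromFile("/proc/sys/vm/overcommit_memory")
        try {
          println(s"vm.overcommit_memory = ${src.mkString.trim}")
        } finally {
          src.close()
        }
      }
    }

Changing the value requires root, e.g. sysctl -w vm.overcommit_memory=0,
or writing the number directly to the same file.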

Another option is simply to decrease the JVM heap size by setting a
smaller -Xmx, so the JVM asks the OS to reserve less memory.
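
In Spark, the executor heap (the -Xmx passed to each executor JVM) is
controlled by spark.executor.memory. Here is a minimal sketch of setting it
from code; the 2g value is only an example and should be sized to your job:

    import org.apache.spark.sql.SparkSession

    object SmallHeapJob {
      def main(args: Array[String]): Unit = {
        // Request a smaller executor heap so each executor JVM reserves
        // less memory from the OS. "2g" is purely illustrative.
        val spark = SparkSession.builder()
          .appName("small-heap-example")
          .config("spark.executor.memory", "2g") // executor JVM -Xmx
          .getOrCreate()

        // ... run the job as usual ...

        spark.stop()
      }
    }

The same thing can be done on the command line with
spark-submit --executor-memory 2g (and --driver-memory for the driver).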

Thanks!

Wei



--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/


