spark-user mailing list archives

From Antony Mayi <>
Subject Re: spark-local dir running out of space during long ALS run
Date Mon, 16 Feb 2015 08:27:20 GMT
spark.cleaner.ttl is not the right way; it seems to be designed for streaming. Although
it keeps the disk usage under control, it also causes the loss of RDDs and broadcasts that are
required later, which leads to a crash.
Is there any other way?
Thanks,
Antony.

     On Sunday, 15 February 2015, 21:42, Antony Mayi <> wrote:

 spark.cleaner.ttl ? 

     On Sunday, 15 February 2015, 18:23, Antony Mayi <> wrote:

I am running a bigger ALS on Spark 1.2.0 on YARN (CDH 5.3.0). ALS is using about 3 billion
ratings and I am doing several trainImplicit() runs in a loop within one Spark session. I
have a four-node cluster with 3TB of disk space on each node. Before starting the job, less
than 8% of the disk space is used. While ALS is running I can see the disk usage growing
rapidly, mainly because of files being stored under yarn/local/usercache/user/appcache/application_XXX_YYY/spark-local-ZZZ-AAA.
After about 10 hours the disk usage hits 90% and YARN kills the affected containers.
Am I missing some cleanup step while looping over the several trainImplicit() calls?
Taking 4*3TB of disk space seems immense.
Thanks for any help,
Antony.
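
To put a number on the disk growth described above, here is a rough back-of-the-envelope sketch. It uses only the figures from this message and assumes (this is not stated in the message) that the 8% and 90% usage levels apply evenly to every node. The function name growth_tb is made up for this illustration.

```python
# Figures quoted in the message: 4 nodes, 3 TB each,
# under 8% used before the job, about 90% when YARN kills the containers.
NODES = 4
DISK_PER_NODE_TB = 3.0
USED_BEFORE = 0.08   # upper bound ("less than 8%"), so the growth is a lower bound
USED_AT_KILL = 0.90

def growth_tb(nodes, per_node_tb, before, after):
    """Disk growth in TB across the cluster between two usage fractions.

    Assumes usage is spread evenly over all nodes (an assumption,
    not something stated in the message).
    """
    return nodes * per_node_tb * (after - before)

total = growth_tb(NODES, DISK_PER_NODE_TB, USED_BEFORE, USED_AT_KILL)
per_node = total / NODES
print(f"cluster: at least {total:.2f} TB, per node: at least {per_node:.2f} TB")
```

Under that assumption the job writes a bit under 10 TB of local files in about 10 hours, which is the scale behind the "4*3TB" remark.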

