In ALS, I guess each iteration's RDDs are referenced by the next iteration's RDDs, so none of the shuffle data will be deleted until the ALS job finishes.

I think checkpointing could solve my problem. Are you familiar with checkpointing?
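To make the idea concrete, here is a minimal toy model in plain Python. It is not Spark code: `ToyRDD`, its `checkpoint` method, and the `cleaned` list are hypothetical names made up for illustration. The sketch assumes each iteration's RDD holds a strong reference to its parent, and that a checkpoint saves the data and then cuts that reference.

```python
import gc
import weakref

cleaned = []  # shuffle ids whose data the toy "cleaner" has removed


class ToyRDD:
    """Hypothetical stand-in for an RDD: keeps a strong reference to its parent."""

    def __init__(self, shuffle_id, parent=None):
        self.shuffle_id = shuffle_id
        self.parent = parent
        # Fires once this object becomes unreachable (assumed cleaner behaviour).
        weakref.finalize(self, cleaned.append, shuffle_id)

    def checkpoint(self):
        # Toy checkpoint: pretend the data was saved, then cut the lineage.
        self.parent = None


def run_iterations(n):
    rdd = ToyRDD(0)
    for i in range(1, n):
        rdd = ToyRDD(i, parent=rdd)  # each iteration references the previous one
    return rdd


latest = run_iterations(5)
gc.collect()
cleaned_before = sorted(cleaned)  # chain intact: nothing is eligible for cleanup

latest.checkpoint()
gc.collect()
cleaned_after = sorted(cleaned)   # the four earlier iterations are now unreachable

print(cleaned_before, cleaned_after)
```

In this toy, nothing is cleaned while the latest object still points back through the chain, and the four earlier iterations become eligible as soon as the lineage is cut. In real Spark the corresponding calls would be `SparkContext.setCheckpointDir` and `RDD.checkpoint`, but whether ALS checkpoints its intermediate RDDs depends on the Spark version and is not shown here.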

On March 3, 2015, at 4:18 PM, nitin [via Apache Spark User List] <[hidden email]> wrote:

Shuffle write data is cleaned up once no object references it, directly or indirectly. Spark has a built-in cleaner that tracks RDDs, shuffle writes, and broadcasts through weak references and periodically deletes those that are no longer reachable.





View this message in context: Re: how to clean shuffle write each iteration