spark-user mailing list archives

From 林武康 <vboylin1...@gmail.com>
Subject Re: What's the lifecycle of an rdd? Can I control it?
Date Thu, 20 Mar 2014 03:30:17 GMT
Thank you, everybody. Nice to know 😊

-----Original Message-----
From: "Nicholas Chammas" <nicholas.chammas@gmail.com>
Sent: 2014/3/20 10:23
To: "user" <user@spark.apache.org>
Subject: Re: What's the lifecycle of an rdd? Can I control it?

Related question:

If I keep creating new RDDs and cache()-ing them, does Spark automatically unpersist the least recently used RDD when it runs out of memory? Or is an explicit unpersist the only way to get rid of an RDD (barring the PR Tathagata mentioned)?
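
For concreteness, a minimal sketch of the scenario being asked about (Scala; the app name, local-mode setup, and numbers are made up for illustration):

    import org.apache.spark.{SparkConf, SparkContext}

    object CacheScenario {
      def main(args: Array[String]): Unit = {
        // Hypothetical local context, just to keep the sketch self-contained.
        val sc = new SparkContext(
          new SparkConf().setAppName("cache-scenario").setMaster("local[*]"))

        // Keep creating and cache()-ing new RDDs. cache() marks each RDD for
        // storage at the default MEMORY_ONLY level; blocks are only
        // materialized once an action runs.
        val cached = (1 to 10).map { i =>
          val rdd = sc.parallelize(1 to 1000000).map(_ * i)
          rdd.cache()
          rdd.count() // action materializes and caches the blocks
          rdd
        }

        // The explicit alternative to relying on eviction:
        cached.head.unpersist()

        sc.stop()
      }
    }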


Also, does unpersist()-ing an RDD immediately free up space, or just allow that space to be
reclaimed when needed?
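
For reference, RDD.unpersist takes a blocking flag that bears on exactly this; a minimal sketch, assuming rdd is an RDD that was cached earlier:

    rdd.unpersist(blocking = true)  // waits until the blocks are actually removed
    rdd.unpersist(blocking = false) // returns immediately; removal happens asynchronously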



On Wed, Mar 19, 2014 at 7:01 PM, Tathagata Das <tathagata.das1565@gmail.com> wrote:

Just a heads-up, there is an active pull request that will automatically unpersist RDDs that are no longer referenced by or in scope of the application.


TD



On Wed, Mar 19, 2014 at 6:58 PM, hequn cheng <chenghequn@gmail.com> wrote:

persist and unpersist.
unpersist: Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
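
A minimal sketch of that lifecycle, assuming an existing SparkContext sc (the input path is hypothetical):

    // persist() marks the RDD for caching; nothing is stored until an action runs.
    val lengths = sc.textFile("hdfs:///path/to/input").map(_.length)

    lengths.persist()     // default StorageLevel.MEMORY_ONLY
    lengths.reduce(_ + _) // first action materializes and caches the blocks
    lengths.count()       // later actions reuse the cached blocks

    lengths.unpersist()   // mark as non-persistent; drop its blocks from memory and disk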



2014-03-19 16:40 GMT+08:00 林武康 <vboylin1987@gmail.com>:


Hi, can anyone tell me about the lifecycle of an RDD? I searched through the official website and still can't figure it out. Can I use an RDD in some stages and then destroy it to release memory, since no later stages will use it any more? Is that possible?

Thanks!

Sincerely 
Lin wukang