spark-user mailing list archives

From Holden Karau <hol...@pigscanfly.ca>
Subject Re: Reading Back a Cached RDD
Date Thu, 24 Mar 2016 19:20:47 GMT
Even checkpoint() may not be exactly what you want: if reference
tracking is turned on, the checkpoint data will get cleaned up once the
original RDD goes out of scope and GC is triggered.
If you want to share persisted RDDs right now, one way to do so is to
share the same SparkContext (using something like the Spark Job Server or
the IBM Spark Kernel).
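To make the caveat above concrete, here is a minimal sketch from a Spark shell (Scala). The checkpoint directory path is a made-up example; `spark.cleaner.referenceTracking.cleanCheckpoints` is the configuration that, when enabled, lets the ContextCleaner delete checkpoint files for garbage-collected RDDs:

```scala
// Checkpoint data is written to a reliable store (e.g. HDFS), but with
// spark.cleaner.referenceTracking.cleanCheckpoints=true it may be removed
// once the RDD is garbage collected in the driver.
sc.setCheckpointDir("hdfs:///tmp/checkpoints")  // example path

val rdd = sc.parallelize(1 to 1000).map(_ * 2)
rdd.checkpoint()  // marks the RDD; data is saved on the next action
rdd.count()       // materializes the RDD and writes the checkpoint

// Holding a live reference to `rdd` keeps the cleaner from removing the
// checkpoint files while this application is running.
```

For durable reuse across separate shell sessions, the usual route is to write the data out explicitly (e.g. `saveAsObjectFile` or a columnar format) and read it back in the new session, rather than relying on persist() or checkpoint().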

On Thu, Mar 24, 2016 at 11:28 AM, Nicholas Chammas <
nicholas.chammas@gmail.com> wrote:

> Isn’t persist() only for reusing an RDD within an active application?
> Maybe checkpoint() is what you’re looking for instead?
>
> On Thu, Mar 24, 2016 at 2:02 PM Afshartous, Nick <nafshartous@turbine.com>
> wrote:
>
>>
>> Hi,
>>
>>
>> After calling RDD.persist(), is it then possible to come back later and
>> access the persisted RDD?
>>
>> Let's say, for instance, coming back and starting a new Spark shell
>> session. How would one access the persisted RDD in the new shell session?
>>
>>
>> Thanks,
>>
>> --
>>
>>    Nick
>>
>


-- 
Cell : 425-233-8271
Twitter: https://twitter.com/holdenkarau
