spark-user mailing list archives

From Chris Fregly <ch...@fregly.com>
Subject Re: Reconnect to an application/RDD
Date Mon, 30 Jun 2014 03:38:59 GMT
Tachyon is another option - this is the "off heap" StorageLevel specified
when persisting RDDs:
http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.storage.StorageLevel
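A minimal sketch of the off-heap option (assuming a Tachyon deployment reachable by the cluster; the data here is placeholder):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

val sc = new SparkContext(new SparkConf().setAppName("OffHeapSketch"))

val rdd = sc.parallelize(1 to 1000000)

// Persist the blocks off-heap (in Tachyon) instead of in executor JVM heap
rdd.persist(StorageLevel.OFF_HEAP)
rdd.count()  // run an action so the blocks are actually materialized and stored
```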

or just use HDFS.  this requires subsequent applications/SparkContexts to
reload the data from disk, of course.
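A sketch of the HDFS round-trip (the path and namenode URL are placeholders; `sc` and `sc2` belong to two separate applications):

```scala
// First application: compute the RDD and write it out
val computed = sc.parallelize(1 to 100).map(_ * 2)
computed.saveAsObjectFile("hdfs://namenode:8020/cache/my-rdd")

// Later application, with its own SparkContext: reload from disk
val reloaded = sc2.objectFile[Int]("hdfs://namenode:8020/cache/my-rdd")
reloaded.cache()
```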


On Tue, Jun 3, 2014 at 6:58 AM, Gerard Maas <gerard.maas@gmail.com> wrote:

> I don't think that's supported by default, as when the standalone context
> closes, the related RDDs will be GC'ed.
>
> You should explore Spark-Job Server, which allows you to cache RDDs by name
> and reuse them within a context.
>
> https://github.com/ooyala/spark-jobserver
>
> -kr, Gerard.
>
>
> On Tue, Jun 3, 2014 at 3:45 PM, Oleg Proudnikov <oleg.proudnikov@gmail.com
> > wrote:
>
>> HI All,
>>
>> Is it possible to run a standalone app that would compute and
>> persist/cache an RDD and then run other standalone apps that would gain
>> access to that RDD?
>>
>> --
>> Thank you,
>> Oleg
>>
>>
>
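For reference, the named-RDD sharing Gerard mentions looks roughly like this in spark-jobserver (a sketch based on the project's `NamedRddSupport` trait; exact package and method names may differ across versions):

```scala
import com.typesafe.config.Config
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD
import spark.jobserver.{NamedRddSupport, SparkJob, SparkJobValid, SparkJobValidation}

object CacheJob extends SparkJob with NamedRddSupport {
  override def validate(sc: SparkContext, config: Config): SparkJobValidation =
    SparkJobValid

  override def runJob(sc: SparkContext, config: Config): Any = {
    // Compute once and register under a name; later jobs submitted to the
    // same long-running context can fetch the cached RDD by that name.
    val rdd: RDD[Int] = sc.parallelize(1 to 100)
    this.namedRdds.update("shared-rdd", rdd)
    this.namedRdds.get[Int]("shared-rdd").map(_.count())
  }
}
```

Note this only shares RDDs between jobs running inside one persistent jobserver context, not between independent standalone applications.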
