spark-user mailing list archives

From Jagat Singh <jagatsi...@gmail.com>
Subject Re: Why RDD is not cached?
Date Tue, 28 Oct 2014 07:22:19 GMT
Which storage level are you using with

persist() or cache()?

http://spark.apache.org/docs/latest/programming-guide.html#rdd-persistence
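To illustrate, here is a minimal Scala sketch of the two persistence calls (assuming an existing SparkContext `sc`; the input path is a placeholder):

```scala
import org.apache.spark.storage.StorageLevel

// Hypothetical input path, just for illustration.
val data = sc.textFile("hdfs:///path/to/input")

// cache() is shorthand for persist(StorageLevel.MEMORY_ONLY).
// With MEMORY_ONLY, a partition that does not fit in memory is simply
// not stored -- it is recomputed on demand, with no error or log entry
// beyond a warning. That can make an RDD look "not cached".
data.persist(StorageLevel.MEMORY_ONLY)

// Alternatively, spill partitions that do not fit to local disk:
// data.persist(StorageLevel.MEMORY_AND_DISK)

// Persistence is lazy: blocks are materialized by the first action.
data.count()

// Inspect the level actually in effect.
println(data.getStorageLevel)
```

Also note that only a fraction of each executor's heap is reserved for cached blocks (spark.storage.memoryFraction, 0.6 by default in Spark 1.x), so a 6.3 G executor has roughly 3.8 G of storage memory, which may explain why a 3.2 G RDD fails to cache while 1-2 G RDDs succeed.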

On Tue, Oct 28, 2014 at 6:18 PM, shahab <shahab.mokari@gmail.com> wrote:

> Hi,
>
> I have a standalone Spark setup where each executor is set to 6.3 G of
> memory; with two workers, that is 12.6 G of memory and 4 cores in total.
>
> I am trying to cache an RDD of approximately 3.2 G, but apparently it is
> not cached: I neither see "BlockManagerMasterActor: Added rdd_XX in
> memory" in the logs, nor any improvement in task performance.
>
> Why is it not cached when there is enough storage memory?
> I tried with smaller RDDs of 1 or 2 G and it works; at least I could see
> "BlockManagerMasterActor: Added rdd_0_1 in memory" and an improvement in
> the results.
>
> Any idea what I am missing in my settings, or... ?
>
> thanks,
> /Shahab
>
