spark-user mailing list archives

From shahab <>
Subject How to enforce RDD to be cached?
Date Wed, 03 Dec 2014 09:52:00 GMT

I noticed that rdd.cache() does not take effect immediately. Because of
Spark's lazy evaluation, the caching only happens at the moment you perform
an action on the RDD. Is this true?

If this is the case, how can I force Spark to cache the RDD immediately at
the cache() statement? I need this for benchmarking, where I want to measure
the RDD caching time separately from the transformation/action processing
time.
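
To make the timing question concrete, here is a minimal sketch of the usual
workaround: call cache() and then immediately run a cheap action such as
count(), so that the first action pays for computing and storing the RDD and
later actions can be timed on their own. The helper time_cache_then_action
and the FakeRDD class are hypothetical names introduced only for this
illustration. FakeRDD is not part of Spark; it just imitates lazy caching so
the sketch runs without a cluster. With PySpark, a real RDD could be passed
instead, since it provides cache() and count().

```python
import time


def time_cache_then_action(rdd):
    """Time cache materialization separately from a later action.

    `rdd` only needs cache() and count(), which PySpark RDDs provide.
    """
    rdd.cache()  # lazy: only marks the RDD for caching, computes nothing

    start = time.perf_counter()
    first_count = rdd.count()  # first action: computes the data and fills the cache
    materialize_s = time.perf_counter() - start

    start = time.perf_counter()
    rdd.count()  # later action: expected to be served from the cache
    cached_action_s = time.perf_counter() - start

    return {
        "count": first_count,
        "materialize_s": materialize_s,
        "cached_action_s": cached_action_s,
    }


class FakeRDD:
    """Hypothetical stand-in (NOT a Spark class) that mimics lazy caching."""

    def __init__(self, compute):
        self._compute = compute        # zero-argument function producing the data
        self._cache_requested = False
        self._cached = None
        self.computations = 0          # how many times the data was recomputed

    def cache(self):
        # Like Spark, only record the request; do no work yet.
        self._cache_requested = True
        return self

    def _materialize(self):
        if self._cached is not None:
            return self._cached
        self.computations += 1
        data = list(self._compute())
        if self._cache_requested:
            self._cached = data
        return data

    def count(self):
        return len(self._materialize())


fake = FakeRDD(lambda: (x * x for x in range(1000)))
result = time_cache_then_action(fake)
print(result["count"], fake.computations)
```

Note that the first timing covers computing the RDD as well as storing it,
so it is an upper bound on the caching cost rather than a pure measurement
of it. With Spark's default memory-only storage level, partitions that do
not fit in memory are not cached, so the second timing may still include
some recomputation.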

