spark-user mailing list archives

From Corey Nolet <cjno...@gmail.com>
Subject Cached RDD
Date Tue, 30 Dec 2014 03:53:42 GMT
If I have 2 RDDs which depend on the same RDD like the following:

val rdd1 = ...

val rdd2 = rdd1.groupBy()...

val rdd3 = rdd1.groupBy()...


If I don't cache rdd1, will its lineage be computed twice (once for rdd2
and once for rdd3)?
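One way to check this empirically would be to count how many times rdd1's own computation runs, with and without cache(). The following is only a minimal sketch under stated assumptions: a local master, Spark 2.x or later (for longAccumulator), and hypothetical names (CachedRddSketch, evaluations) and grouping keys that stand in for the elided parts of the snippet above.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object CachedRddSketch {
  // Runs both groupBy jobs and returns how many times the map function
  // that defines rdd1 was invoked across the two actions.
  // Assumption: rdd1 is a simple map over 100 integers; the real rdd1 is elided above.
  def evaluations(sc: SparkContext, cache: Boolean): Long = {
    val counter = sc.longAccumulator("rdd1 evaluations")
    val base = sc.parallelize(1 to 100, 4).map { x => counter.add(1); x }
    val rdd1 = if (cache) base.cache() else base

    // Two independent children of rdd1 (grouping keys are hypothetical).
    val rdd2 = rdd1.groupBy(_ % 2)
    val rdd3 = rdd1.groupBy(_ % 3)

    // Each action triggers its own job.
    rdd2.count()
    rdd3.count()

    if (cache) rdd1.unpersist()
    counter.value
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("cached-rdd-sketch")
    val sc = new SparkContext(conf)
    try {
      println(s"without cache: ${evaluations(sc, cache = false)}")
      println(s"with cache:    ${evaluations(sc, cache = true)}")
    } finally {
      sc.stop()
    }
  }
}
```

Comparing the two printed counts against the number of input elements shows whether rdd1 was recomputed for the second action.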
