spark-user mailing list archives

From Rishi Yadav <ri...@infoobjects.com>
Subject Re: Cached RDD
Date Tue, 30 Dec 2014 18:32:46 GMT
Without caching, the lineage of rdd1 is recomputed for every action that
depends on it. So, assuming rdd2 and rdd3 are consumed by separate actions,
the answer is yes.
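
To make the recomputation visible, here is a minimal sketch, assuming a
local Spark installation (2.x or later, for sc.longAccumulator). The object
name, the input data, and the groupBy key functions are hypothetical, since
the original snippet elides them. An accumulator counts how many times the
map function behind rdd1 runs. Accumulator updates made inside
transformations are not guaranteed to be exactly-once, so treat the counts
as an illustration rather than a precise measurement.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object CachedRddSketch {
  // Runs two actions that both depend on rdd1 and returns how many times
  // rdd1's map function was evaluated across both jobs.
  def countEvaluations(cache: Boolean): Long = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("cached-rdd-sketch")
    val sc = new SparkContext(conf)
    try {
      val evaluations = sc.longAccumulator("rdd1 evaluations")

      // Hypothetical stand-in for "val rdd1 = ..." in the question.
      val rdd1 = sc.parallelize(1 to 4, 2).map { x =>
        evaluations.add(1)
        x
      }
      if (cache) rdd1.cache()

      // Hypothetical key functions; the question only shows "groupBy()...".
      val rdd2 = rdd1.groupBy(_ % 2)
      val rdd3 = rdd1.groupBy(_ % 3)

      // Two separate actions, so two separate jobs over rdd1's lineage.
      rdd2.count()
      rdd3.count()

      evaluations.value.longValue()
    } finally {
      sc.stop()
    }
  }

  def main(args: Array[String]): Unit = {
    println(s"without cache: ${countEvaluations(cache = false)}")
    println(s"with cache:    ${countEvaluations(cache = true)}")
  }
}
```

In a local run with no task retries, the uncached count should be twice the
number of input elements and the cached count equal to it, because the
second job reads rdd1's partitions from the cache instead of recomputing
them.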

On Mon, Dec 29, 2014 at 7:53 PM, Corey Nolet <cjnolet@gmail.com> wrote:

> If I have 2 RDDs which depend on the same RDD like the following:
>
> val rdd1 = ...
>
> val rdd2 = rdd1.groupBy()...
>
> val rdd3 = rdd1.groupBy()...
>
>
> If I don't cache rdd1, will its lineage be calculated twice (once for rdd2
> and once for rdd3)?
>
