spark-user mailing list archives

From Akshat Aranya
Subject One pass compute() to produce multiple RDDs
Date Thu, 09 Oct 2014 21:55:25 GMT

Is there a good way to materialize derivative RDDs from, say, a HadoopRDD
while reading the data only once? One way to do so would be to cache
the HadoopRDD and then create the derivative RDDs from it, but that would
require enough RAM to cache the HadoopRDD, which is not an option in my case.

