spark-user mailing list archives

From Todd
Subject What's the benefit of RDD checkpoint against RDD save
Date Thu, 24 Mar 2016 02:14:07 GMT

I have a long computation chain that produces a final RDD after a series of transformations.
I have two choices for what to do with this last RDD:

1. Call checkpoint on the RDD to materialize it to disk
2. Call RDD.saveXXX to save it to HDFS, then read it back for further processing
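
To make the two choices concrete, here is a minimal sketch of what each might look like. It is
an illustration under assumptions, not code from my actual job: the short transformation chain,
the local master, and both paths are hypothetical stand-ins, and saveAsObjectFile is just one
possible member of the saveXXX family.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object CheckpointVsSave {
  def main(args: Array[String]): Unit = {
    // Local master so the sketch runs without a cluster (assumption).
    val conf = new SparkConf().setAppName("checkpoint-vs-save").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Hypothetical paths; on a cluster these would normally be HDFS locations.
    val checkpointDir = "/tmp/spark-checkpoints"
    val saveDir = "/tmp/last-rdd" // must not already exist, or the save fails

    // Hypothetical stand-in for the long transformation chain.
    val last = sc.parallelize(1 to 1000).map(_ * 2).filter(_ % 3 == 0)

    // Choice 1: checkpoint.
    // The checkpoint directory must be set first, and checkpoint() must be
    // called before any action has run on this RDD.
    sc.setCheckpointDir(checkpointDir)
    last.cache()      // avoids recomputing the chain when the checkpoint is written
    last.checkpoint() // only marks the RDD; nothing is written yet
    last.count()      // the first action triggers the actual checkpoint write

    // Choice 2: save explicitly, then read the data back as a new RDD.
    last.saveAsObjectFile(saveDir)
    val reloaded = sc.objectFile[Int](saveDir)
    println(s"reloaded ${reloaded.count()} records")

    sc.stop()
  }
}
```

In both cases the further processing no longer depends on the original chain: the checkpointed
RDD drops its references to its parent RDDs, and the reloaded RDD starts from the saved files.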

Which choice is better? It looks to me as though there is not much difference between the
two.
