spark-dev mailing list archives

From dachuan <hdc1...@gmail.com>
Subject a question about RDD.checkpoint()
Date Sat, 09 Nov 2013 03:01:54 GMT
Hello,

I have a quick question about RDD.checkpoint().

If the user calls RDD.checkpoint(), then after the job finishes Spark
calls RDD.doCheckpoint() to do the actual physical checkpointing, that
is to say, it writes this RDD's partitions to HDFS.

Does this mean that all of its parent RDD Scala objects, and the RDD data
(which is managed by BlockManager), will be garbage collected?

Also, could you please point me to the relevant part of the source code,
if possible?
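
To make the question concrete, here is a toy Python sketch of the behaviour being asked about. It is not Spark code and makes no claim about what Spark actually does: ToyRDD, do_checkpoint, run_job and the storage dict are all hypothetical names, the dict only stands in for HDFS, and BlockManager-managed data is not modelled at all. The sketch assumes that the deferred checkpoint step drops the reference to the parent, which is exactly the point in question, and shows that under that assumption the parent object becomes collectible once nothing else refers to it.

```python
import gc
import weakref


class ToyRDD:
    """Toy stand-in for an RDD: a name, some data, and a link to its parent."""

    def __init__(self, name, data, parent=None):
        self.name = name
        self.data = list(data)
        self.parent = parent              # lineage: strong reference to the parent
        self.checkpoint_requested = False
        self.checkpoint_file = None       # hypothetical stand-in for an HDFS path

    def checkpoint(self):
        # Only marks the RDD for checkpointing; nothing is written yet.
        self.checkpoint_requested = True

    def do_checkpoint(self, storage):
        # Runs after a job: write the data out, then cut the lineage.
        # ASSUMPTION: the parent reference is dropped at this point.
        if self.checkpoint_requested and self.checkpoint_file is None:
            self.checkpoint_file = "checkpoint/" + self.name
            storage[self.checkpoint_file] = list(self.data)
            self.parent = None


def run_job(rdd, storage):
    """Toy job: compute a result, then perform any pending checkpoint."""
    result = sum(rdd.data)
    rdd.do_checkpoint(storage)
    return result


storage = {}                               # stands in for HDFS
parent = ToyRDD("parent", [1, 2, 3])
child = ToyRDD("child", [x * 2 for x in parent.data], parent=parent)

parent_ref = weakref.ref(parent)           # lets us observe collection
del parent                                 # child's lineage is now the only strong reference

child.checkpoint()
alive_before_job = parent_ref() is not None

total = run_job(child, storage)
gc.collect()                               # not needed on CPython, harmless elsewhere
parent_collected = parent_ref() is None
```

In this toy model the parent stays alive after checkpoint() is called and only becomes collectible after the job has run and the lineage link has been cut. Any other strong reference to the parent (for example a variable still held by the user) would keep it alive regardless.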

thanks,
dachuan.

-- 
Dachuan Huang
Cellphone: 614-390-7234
2015 Neil Avenue
Ohio State University
Columbus, Ohio
U.S.A.
43210
