spark-user mailing list archives

From Sung Hwan Chung <coded...@cs.stanford.edu>
Subject Spark job (not Spark streaming) doesn't delete un-needed checkpoints.
Date Fri, 10 Oct 2014 16:15:11 GMT
Unneeded checkpoints are not being deleted automatically in my
application.

The lineage looks something like the following, and the checkpoints simply
accumulate in a temporary directory. (Every point in the lineage, however,
is zipped with a globally permanent RDD.)

PermanentRDD:    Global zips with all the intermediate ones

Intermediate RDDs: A--->B--->C---->D---->E---->F---->----->G
                   (checkpoints are taken at three points along this chain)

The older intermediate RDDs are never used again.
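
For reference, the pattern described above might be sketched in PySpark roughly as follows. This is only an illustration under assumptions: the application's real code is not shown in this message, so the names `run` and `should_checkpoint`, the checkpoint interval of 3, and the element-wise sum are all hypothetical. `sc` is assumed to be an existing SparkContext.

```python
def should_checkpoint(step, interval):
    """Return True on the steps where an intermediate RDD is checkpointed."""
    return step % interval == 0


def run(sc, checkpoint_dir, num_steps=7, interval=3):
    """Build a chain of intermediate RDDs, each zipped with one permanent RDD.

    `sc` is an existing SparkContext; `checkpoint_dir` is the directory in
    which the checkpoint files accumulate.
    """
    sc.setCheckpointDir(checkpoint_dir)

    # The globally permanent RDD that every lineage point zips with.
    permanent = sc.parallelize(range(100), 4).cache()

    # First intermediate RDD ("A" in the diagram); map keeps the partitioning,
    # so it can be zipped with `permanent` later.
    current = permanent.map(lambda x: 0)

    for step in range(1, num_steps + 1):
        # Next intermediate RDD: zip the previous one with the permanent RDD.
        current = permanent.zip(current).map(lambda pair: pair[0] + pair[1])
        if should_checkpoint(step, interval):
            current.checkpoint()  # must be called before any action on this RDD
            current.count()       # an action forces the checkpoint to be written

    # Only the latest intermediate RDD is returned; references to the older
    # ones are dropped here.
    return current
```

With the defaults above, checkpoints would be written at steps 3 and 6, each under its own subdirectory of `checkpoint_dir`.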
