spark-user mailing list archives

From Isabelle Phan
Subject Delete checkpointed data for a single dataset?
Date Wed, 23 Oct 2019 18:43:28 GMT

In a non-streaming application, I am using the checkpoint feature to
truncate the lineage of complex datasets. At the end of the job, the
checkpointed data, which is stored in HDFS, is deleted.
I am looking for a way to delete unused checkpointed data before the end
of the job. If I know that one dataset won't be used anymore, is there a
way to delete its checkpointed data in the middle of the job?

Thank you,

