spark-user mailing list archives

From Isabelle Phan <nlip...@gmail.com>
Subject Delete checkpointed data for a single dataset?
Date Wed, 23 Oct 2019 18:43:28 GMT
Hello,

In a non-streaming application, I am using the checkpoint feature to
truncate the lineage of complex datasets. At the end of the job, the
checkpointed data, which is stored in HDFS, is deleted.
I am looking for a way to delete the unused checkpointed data earlier than
the end of the job. If I know that one dataset won't be used anymore, is
there a way to delete its checkpointed data in the middle of the
application?

Thank you,

Isabelle
