Hey David,

My recommendation: the PipelineExecution object returned by Pipeline.runAsync() implements Future<PipelineResult>, so you can add a listener to it that cleans up your checkpoint directories when the job finishes, based on whether it succeeded or failed.
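
Here's a rough sketch of the pattern using only the JDK, so treat it as an illustration rather than Crunch code. CompletableFuture stands in for the PipelineExecution, a local temp directory stands in for the HDFS checkpoint path, and the class and method names (CheckpointCleanup, cleanUpOnSuccess, deleteRecursively) are made up for the example. It also assumes you want the checkpoints removed on success and kept on failure; flip the condition if you want the opposite.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Stream;

public class CheckpointCleanup {

    /** Recursively deletes a directory (local-filesystem stand-in for an HDFS delete). */
    public static void deleteRecursively(Path dir) {
        if (!Files.exists(dir)) {
            return;
        }
        try (Stream<Path> paths = Files.walk(dir)) {
            // Reverse order so children are deleted before their parents.
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Attaches a completion callback to the run (hypothetical helper).
     * Assumed policy: delete the checkpoint dir on success, keep it on failure.
     * The returned future completes only after the callback has run.
     */
    public static <T> CompletableFuture<T> cleanUpOnSuccess(
            CompletableFuture<T> run, Path checkpointDir) {
        return run.whenComplete((result, error) -> {
            if (error == null) {
                deleteRecursively(checkpointDir);
            }
        });
    }

    public static void main(String[] args) throws Exception {
        // Fake checkpoint directory containing one output file.
        Path dir = Files.createTempDirectory("checkpoint");
        Files.createFile(dir.resolve("part-00000"));

        // Stand-in for pipeline.runAsync(): an async job that succeeds.
        CompletableFuture<String> run = CompletableFuture.supplyAsync(() -> "SUCCEEDED");

        // Wait for both the job and the cleanup callback to finish.
        cleanUpOnSuccess(run, dir).get();
        System.out.println("checkpoint dir still exists: " + Files.exists(dir));
    }
}
```

In a real job you would attach the callback to the object returned by runAsync() and do the delete through the Hadoop FileSystem API instead of java.nio.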


On Wed, Apr 1, 2015 at 5:52 AM, David Ortiz <dpo5003@gmail.com> wrote:
Hey everyone,

     Is there a setting I can tweak in my MRPipeline so that its post-run cleanup also removes the checkpoint dirs I create, or would I need to add some HDFS code at the end for the case of a successful run?

     Dave Ortiz

Director of Data Science
Twitter: @josh_wills