Hey David,
I'd recommend taking advantage of the fact that the PipelineExecution
object returned by Pipeline.runAsync() implements Future<PipelineResult>:
you can add a listener to it that cleans up your checkpoint directories
when the job finishes, depending on whether it succeeded or failed.
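A minimal sketch of that pattern, not Crunch code: it uses only the JDK so it runs on its own. CompletableFuture stands in for the PipelineExecution, java.nio.file stands in for HDFS, and the class and method names (CheckpointCleanup, cleanUpOnSuccess, deleteRecursively) are made up for illustration. It also assumes you want to delete the checkpoints only on success and keep them after a failure.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Stream;

// Hypothetical helper: all names here are illustrative, not part of Crunch.
class CheckpointCleanup {

    /** Recursively deletes dir (children first) if it exists. */
    static void deleteRecursively(Path dir) {
        if (!Files.exists(dir)) {
            return;
        }
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse order visits the deepest paths before their parents.
            walk.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Attaches a completion callback to the (stand-in) execution future.
     * Assumption: checkpoints are removed only when the run succeeded and
     * are left in place after a failure.
     */
    static <T> CompletableFuture<T> cleanUpOnSuccess(
            CompletableFuture<T> execution, List<Path> checkpointDirs) {
        return execution.whenComplete((result, error) -> {
            if (error == null) {
                checkpointDirs.forEach(CheckpointCleanup::deleteRecursively);
            }
        });
    }
}
```

In a real pipeline the callback would hang off the PipelineExecution instead, and the delete would go through Hadoop's FileSystem.delete(path, true) rather than java.nio.file.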
Josh
On Wed, Apr 1, 2015 at 5:52 AM, David Ortiz <dpo5003@gmail.com> wrote:
> Hey everyone,
>
> Is there a setting I can tweak in my MRPipeline so when it does its
> cleanup after a run it cleans up the checkpoint dirs I create as well, or
> would I need to add some hdfs code at the end in the case of a successful
> run?
>
> Thanks,
> Dave Ortiz
>
--
Director of Data Science
Cloudera <http://www.cloudera.com>
Twitter: @josh_wills <http://twitter.com/josh_wills>