flink-user mailing list archives

From Aljoscha Krettek <aljos...@apache.org>
Subject Re: Retrieving name of last external checkpoint directory
Date Tue, 20 Feb 2018 16:17:40 GMT
Hi,

I think there is currently no easy way of doing this. Things that come to mind are:
 - looking at the JobManager (JM) log
 - polling the JM REST interface for completed externalised checkpoints
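For the REST option, a minimal sketch in Python follows. It assumes the JM serves checkpoint statistics at /jobs/<jobid>/checkpoints and that the response carries the path under latest.completed.external_path; please verify both against the REST documentation of your Flink version. The base URL and job id are placeholders, and the function names are made up for illustration.

```python
import json
from typing import Optional
from urllib.request import urlopen


def latest_external_path(stats: dict) -> Optional[str]:
    """Extract the external path of the latest completed checkpoint.

    Assumes the statistics JSON has the shape
    {"latest": {"completed": {"external_path": ...}}}.
    Returns None if no completed checkpoint is reported.
    """
    latest = stats.get("latest") or {}
    completed = latest.get("completed") or {}
    return completed.get("external_path")


def fetch_checkpoint_stats(base_url: str, job_id: str) -> dict:
    """Fetch checkpoint statistics for one job from the JM REST interface.

    The endpoint path is an assumption; adjust it to your Flink version.
    """
    with urlopen(f"{base_url}/jobs/{job_id}/checkpoints", timeout=10) as resp:
        return json.load(resp)
```

Usage would be something like latest_external_path(fetch_checkpoint_stats("http://<jm-host>:<port>", "<job-id>")), run periodically so that the last known path is still at hand once the job has failed.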

The good news is that Flink 1.5 will rework how externalised checkpoints work a bit: basically,
all checkpoints can now be considered externalised and the metadata will be stored in the
root directory of the checkpoint, not in one global directory for all jobs. This way, the
metadata for externalised checkpoints resides in the checkpoint directory of each job and
it should be reasonably simple to restore from that.
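Once the metadata lives in the job's own checkpoint directory, picking the latest checkpoint could look like the sketch below. It assumes a layout of <job checkpoint dir>/chk-<n>/_metadata and a directory that is reachable as a local or mounted path; for HDFS you would need an HDFS client instead of pathlib. The function name is hypothetical.

```python
import re
from pathlib import Path
from typing import Optional

# Assumed naming scheme for checkpoint subdirectories: chk-<number>
_CHK_NAME = re.compile(r"chk-(\d+)")


def latest_checkpoint_dir(job_checkpoint_dir: str) -> Optional[Path]:
    """Return the highest-numbered chk-<n> directory that has a _metadata file.

    Directories without _metadata are skipped, since they may belong to
    checkpoints that are incomplete or still in progress.
    """
    root = Path(job_checkpoint_dir)
    if not root.is_dir():
        return None

    best_id = -1
    best_dir = None
    for child in root.iterdir():
        match = _CHK_NAME.fullmatch(child.name)
        if match is None or not (child / "_metadata").is_file():
            continue
        checkpoint_id = int(match.group(1))
        if checkpoint_id > best_id:
            best_id, best_dir = checkpoint_id, child
    return best_dir
```

The returned directory is what you would then pass to the restore command. Comparing the numeric id rather than the directory name avoids chk-9 sorting after chk-12.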

Best,
Aljoscha

> On 15. Feb 2018, at 10:55, Dawid Wysakowicz <wysakowicz.dawid@gmail.com> wrote:
> 
> Hi,
> 
> We are running a few jobs on YARN and, in case of some failure (that the job could not
> recover from on its own), we want to use the last successful external checkpoint to
> restore the job manually. The problem is that ${state.checkpoints.dir} contains
> checkpoint directories for all jobs that we are running. How can we find out the last
> successful external checkpoint for some particular job? I will be grateful for any
> pointers.
> 
> Regards,
> Dawid

