flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-4201) Checkpoints for jobs in non-terminal state (e.g. suspended) get deleted
Date Thu, 21 Jul 2016 14:38:20 GMT

    [ https://issues.apache.org/jira/browse/FLINK-4201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15387789#comment-15387789 ]

ASF GitHub Bot commented on FLINK-4201:

Github user uce commented on the issue:

    Thanks for taking a look, Stephan.
    Regarding your question: *Would we interfere with such a setup when removing checkpoints
on "suspend" in "standalone" mode?*:
    Yes, we would interfere, but what you describe is currently **not** possible with Flink
(that is, no one can run it like that). The problem is that recovery on the master is tightly
coupled to ZooKeeper (configured via `recovery.mode: ZOOKEEPER`). I really like your idea
and agree that it should be possible to run an HA setup like that. I will open an issue for
it. Do you think it's important to fix this for 1.1 already?
    Regarding the name *standalone*:
    I fully agree. We have both a standalone cluster mode and a standalone recovery mode. Our
standalone recovery mode (`recovery.mode: STANDALONE`) actually means `NO_RECOVERY`. I think
that's also what made you assume that what you describe is possible, right?

> Checkpoints for jobs in non-terminal state (e.g. suspended) get deleted
> -----------------------------------------------------------------------
>                 Key: FLINK-4201
>                 URL: https://issues.apache.org/jira/browse/FLINK-4201
>             Project: Flink
>          Issue Type: Bug
>          Components: State Backends, Checkpointing
>            Reporter: Stefan Richter
>            Assignee: Ufuk Celebi
>            Priority: Blocker
> For example, when shutting down a YARN session, the logs show that checkpoints for
jobs that did not terminate are deleted. In the shutdown hook, removeAllCheckpoints is called
and removes checkpoints that should still be kept.
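
The behavior the description asks for can be illustrated with a minimal, self-contained sketch. It is not Flink's actual code: apart from the method name `removeAllCheckpoints`, which the description mentions, every class, enum, and method here (`CheckpointShutdownSketch`, `JobState`, `CheckpointStore`, `shutdown`) is a hypothetical stand-in. The assumption is that checkpoints are discarded only when the job has reached a state from which it can never be recovered, and are kept for non-terminal states such as suspended.

```java
import java.util.ArrayList;
import java.util.List;

public class CheckpointShutdownSketch {

    /** Hypothetical job states; the flag marks states from which no recovery is possible. */
    enum JobState {
        RUNNING(false),
        SUSPENDED(false),
        FINISHED(true),
        CANCELED(true),
        FAILED(true);

        private final boolean globallyTerminal;

        JobState(boolean globallyTerminal) {
            this.globallyTerminal = globallyTerminal;
        }

        boolean isGloballyTerminal() {
            return globallyTerminal;
        }
    }

    /** Hypothetical in-memory stand-in for a store of completed checkpoints. */
    static class CheckpointStore {
        private final List<String> checkpoints = new ArrayList<>();

        void add(String handle) {
            checkpoints.add(handle);
        }

        int size() {
            return checkpoints.size();
        }

        void removeAllCheckpoints() {
            checkpoints.clear();
        }

        /** Discards checkpoints only if the job can no longer be recovered. */
        void shutdown(JobState state) {
            if (state.isGloballyTerminal()) {
                removeAllCheckpoints();
            }
            // Otherwise keep the checkpoints so that a later recovery can use them.
        }
    }

    public static void main(String[] args) {
        CheckpointStore suspended = new CheckpointStore();
        suspended.add("chk-1");
        suspended.shutdown(JobState.SUSPENDED);
        System.out.println("suspended job keeps " + suspended.size() + " checkpoint(s)");

        CheckpointStore finished = new CheckpointStore();
        finished.add("chk-1");
        finished.shutdown(JobState.FINISHED);
        System.out.println("finished job keeps " + finished.size() + " checkpoint(s)");
    }
}
```

In this sketch the decision lives in one place (`shutdown`), so a shutdown hook that calls it cannot delete checkpoints of a suspended job by accident.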

This message was sent by Atlassian JIRA
