flink-issues mailing list archives

From "Robert Metzger (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (FLINK-5064) Checkpoint messages are not scoped to the leader session ID
Date Mon, 23 Jan 2017 12:23:26 GMT

     [ https://issues.apache.org/jira/browse/FLINK-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Metzger updated FLINK-5064:
    Fix Version/s:     (was: 1.2.0)

> Checkpoint messages are not scoped to the leader session ID
> -----------------------------------------------------------
>                 Key: FLINK-5064
>                 URL: https://issues.apache.org/jira/browse/FLINK-5064
>             Project: Flink
>          Issue Type: Bug
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.2.0, 1.1.3
>            Reporter: Till Rohrmann
> The checkpoint messages ({{AbstractCheckpointMessage}}) don't implement the {{RequiresLeaderSessionID}}
> interface. Thus, they are not scoped to the leadership of a {{JobManager}} and can interfere
> with a new leader session.
> The downside of scoping the checkpoint messages to the leader session ID is that messages might
> get filtered out, leading to resource leaks because the contained state handle is never discarded.
> However, a JM failure can lead to the same situation if checkpoint messages are in flight
> at the time.
> To mitigate the problem, one could change the behaviour such that the {{CheckpointResponder}}
> awaits a response and discards the state handle if the response is negative or does not
> arrive in time (timeout).
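
The behaviour described above can be modelled roughly as follows. This is only a sketch, written in Python for brevity, and it is not Flink code: StateHandle, AcknowledgeCheckpoint, JobManager, acknowledge, the reply queue and the timeout value are all hypothetical names and choices made for illustration. It assumes that a message carrying a stale leader session ID is dropped without any reply, that the receiver answers negatively for a checkpoint it is not waiting for, and that a failed JobManager never answers at all.

```python
import queue
import uuid
from dataclasses import dataclass, field


@dataclass
class StateHandle:
    """Hypothetical stand-in for the state handle carried by a checkpoint message."""
    discarded: bool = False

    def discard(self) -> None:
        self.discarded = True


@dataclass
class AcknowledgeCheckpoint:
    """Hypothetical checkpoint message tagged with the sender's view of the leader session."""
    checkpoint_id: int
    leader_session_id: uuid.UUID
    state: StateHandle
    # The receiver puts True (accepted) or False (declined) here, or nothing at all.
    reply: queue.Queue = field(default_factory=queue.Queue)


class JobManager:
    """Toy receiver that scopes checkpoint messages to its leader session ID."""

    def __init__(self, leader_session_id, pending, alive=True):
        self.leader_session_id = leader_session_id
        self.pending = set(pending)  # checkpoint IDs the leader is waiting for
        self.alive = alive
        self.accepted = []

    def receive(self, msg):
        if not self.alive:
            return  # failed JobManager: the in-flight message is lost
        if msg.leader_session_id != self.leader_session_id:
            return  # stale leader session: filtered out, no reply is sent
        if msg.checkpoint_id not in self.pending:
            msg.reply.put(False)  # negative response
            return
        self.pending.discard(msg.checkpoint_id)
        self.accepted.append(msg.checkpoint_id)
        msg.reply.put(True)


def acknowledge(job_manager, msg, timeout_s=0.05):
    """Send msg, await the response, and discard the state handle on decline or timeout."""
    job_manager.receive(msg)
    try:
        accepted = msg.reply.get(timeout=timeout_s)
    except queue.Empty:
        accepted = False  # outstanding response: treat as a failure
    if not accepted:
        msg.state.discard()
    return accepted


old_session, new_session = uuid.uuid4(), uuid.uuid4()
leader = JobManager(new_session, pending=[1])
failed = JobManager(new_session, pending=[4], alive=False)

ok = AcknowledgeCheckpoint(1, new_session, StateHandle())
stale = AcknowledgeCheckpoint(2, old_session, StateHandle())
unknown = AcknowledgeCheckpoint(3, new_session, StateHandle())
lost = AcknowledgeCheckpoint(4, new_session, StateHandle())

results = [
    acknowledge(leader, ok),
    acknowledge(leader, stale),
    acknowledge(leader, unknown),
    acknowledge(failed, lost),
]
```

In this model only the first message is accepted. The stale, declined and lost messages all end with their state handle discarded by the sender, so filtering by leader session ID no longer leaks state, at the cost of one extra round trip per checkpoint message.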

This message was sent by Atlassian JIRA
