flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-1953) Rework Checkpoint Coordinator
Date Tue, 05 May 2015 07:46:05 GMT

    [ https://issues.apache.org/jira/browse/FLINK-1953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14528078#comment-14528078 ]

ASF GitHub Bot commented on FLINK-1953:
---------------------------------------

Github user rmetzger commented on the pull request:

    https://github.com/apache/flink/pull/651#issuecomment-98984867
  
    Great!
    I'll integrate it with my new Kafka Source and test everything on a cluster.


> Rework Checkpoint Coordinator
> -----------------------------
>
>                 Key: FLINK-1953
>                 URL: https://issues.apache.org/jira/browse/FLINK-1953
>             Project: Flink
>          Issue Type: Bug
>          Components: Streaming
>    Affects Versions: 0.9
>            Reporter: Stephan Ewen
>            Assignee: Stephan Ewen
>             Fix For: 0.9
>
>
> The checkpoint coordinator currently contains no tests and is vulnerable to a variety
> of situations. In particular, I propose to add:
>  - Better configurability of which tasks receive the trigger checkpoint messages, which
>    tasks need to acknowledge the checkpoint, and which tasks need to receive confirmation messages.
>  - Checkpoint timeouts, such that incomplete checkpoints are guaranteed to be cleaned
>    up after a while, regardless of successful checkpoints
>  - Better sanity checking of messages and fields, to properly handle/ignore messages
>    for old/expired checkpoints, or invalidly routed messages
>  - Better handling of checkpoint attempts at points where the execution has just failed
>    or is currently being canceled.
>  - A good set of tests
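
The timeout and expired-message points in the list above can be illustrated with a small sketch. This is not the code from the pull request and not Flink's CheckpointCoordinator; the class name PendingCheckpointTracker and all of its methods are hypothetical, and it assumes for simplicity that a single acknowledgement completes a checkpoint and that the caller passes in the current time.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/** Hypothetical sketch only; not Flink's actual checkpoint coordinator. */
class PendingCheckpointTracker {

    private final long timeoutMillis;

    // checkpoint id -> time at which the checkpoint was triggered
    private final Map<Long, Long> pending = new HashMap<>();

    PendingCheckpointTracker(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Registers a newly triggered checkpoint. */
    void trigger(long checkpointId, long nowMillis) {
        pending.put(checkpointId, nowMillis);
    }

    /**
     * Handles an acknowledgement. Returns false (message ignored) when the
     * checkpoint is unknown, already completed, or has timed out.
     */
    boolean acknowledge(long checkpointId, long nowMillis) {
        expire(nowMillis);
        return pending.remove(checkpointId) != null;
    }

    /** Drops all pending checkpoints older than the timeout; returns how many were dropped. */
    int expire(long nowMillis) {
        int removed = 0;
        Iterator<Map.Entry<Long, Long>> it = pending.entrySet().iterator();
        while (it.hasNext()) {
            if (nowMillis - it.next().getValue() >= timeoutMillis) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    int numPending() {
        return pending.size();
    }
}
```

In this sketch expiry runs on every acknowledgement, so an incomplete checkpoint is cleaned up even if no later checkpoint ever succeeds; a real coordinator would more likely use a timer and track acknowledgements per task.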



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
