flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-8360) Implement task-local state recovery
Date Mon, 08 Jan 2018 12:53:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-8360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16316251#comment-16316251 ]

ASF GitHub Bot commented on FLINK-8360:

Github user pnowojski commented on a diff in the pull request:

    --- Diff: flink-contrib/flink-storm/src/test/java/org/apache/flink/storm/wrappers/BoltWrapperTest.java
    @@ -375,11 +376,13 @@ public void declareOutputFields(OutputFieldsDeclarer declarer) {
     		when(env.getMetricGroup()).thenReturn(new UnregisteredTaskMetricsGroup());
     		when(env.getTaskManagerInfo()).thenReturn(new TestingTaskManagerRuntimeInfo());
    +		final CloseableRegistry closeableRegistry = new CloseableRegistry();
     		StreamTask<?, ?> mockTask = mock(StreamTask.class);
     		when(mockTask.getCheckpointLock()).thenReturn(new Object());
     		when(mockTask.getConfiguration()).thenReturn(new StreamConfig(new Configuration()));
    +		when(mockTask.getCancelables()).thenReturn(closeableRegistry);
    --- End diff --
    I don't like the idea of postponing such things. I would really like this to be done in
this PR (as a separate commit, ideally at the bottom to avoid modifying the same lines of code
twice, but as a last resort it could also be the last commit). Otherwise we will forget about
it, and while this is only one added line, it increases our technical debt and makes our code
base a tiny bit worse than it was before. Both of those things are quite dangerous.

> Implement task-local state recovery
> -----------------------------------
>                 Key: FLINK-8360
>                 URL: https://issues.apache.org/jira/browse/FLINK-8360
>             Project: Flink
>          Issue Type: New Feature
>          Components: State Backends, Checkpointing
>            Reporter: Stefan Richter
>            Assignee: Stefan Richter
>             Fix For: 1.5.0
> This issue tracks the development of recovery from task-local state. The main idea is
to have a secondary, local copy of the checkpointed state, while there is still a primary
copy in DFS that we report to the checkpoint coordinator.
> Recovery can attempt to restore from the secondary local copy, if available, to save
network bandwidth. This requires that the assignment of tasks to slots is as sticky as possible.
> For starters, we will implement this feature for all managed keyed states and can easily
extend it to all other state types (e.g. operator state) later.
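
The idea in the issue description (write a primary copy to DFS and a secondary copy locally, then prefer the local copy on recovery) can be sketched roughly as below. This is only an illustration under stated assumptions, not Flink's implementation: LocalRecoverySketch, StateStore, RestoreResult, checkpoint and restore are hypothetical names, and in-memory maps stand in for both DFS and the task-local directory.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch of task-local recovery; none of these names are Flink APIs. */
final class LocalRecoverySketch {

    /** Stand-in for a snapshot store (DFS or local disk), keyed by checkpoint id. */
    static final class StateStore {
        private final Map<Long, byte[]> snapshots = new HashMap<>();

        void put(long checkpointId, byte[] state) {
            snapshots.put(checkpointId, state.clone());
        }

        Optional<byte[]> get(long checkpointId) {
            return Optional.ofNullable(snapshots.get(checkpointId));
        }

        /** Simulates losing the copy, e.g. the task was rescheduled to another machine. */
        void remove(long checkpointId) {
            snapshots.remove(checkpointId);
        }
    }

    /** Restored state plus a flag telling which copy it came from. */
    static final class RestoreResult {
        final byte[] state;
        final boolean fromLocal;

        RestoreResult(byte[] state, boolean fromLocal) {
            this.state = state;
            this.fromLocal = fromLocal;
        }
    }

    /** Writes the primary copy (the one reported to the coordinator) and the secondary local copy. */
    static void checkpoint(StateStore primary, StateStore local, long checkpointId, byte[] state) {
        primary.put(checkpointId, state);
        local.put(checkpointId, state);
    }

    /** Prefers the local copy to save network bandwidth; falls back to the primary copy. */
    static RestoreResult restore(StateStore primary, StateStore local, long checkpointId) {
        Optional<byte[]> localCopy = local.get(checkpointId);
        if (localCopy.isPresent()) {
            return new RestoreResult(localCopy.get(), true);
        }
        byte[] remoteCopy = primary.get(checkpointId).orElseThrow(
                () -> new IllegalStateException("No primary copy for checkpoint " + checkpointId));
        return new RestoreResult(remoteCopy, false);
    }
}
```

In this sketch the local copy is purely an optimization: if the task does not come back to the same slot, restore falls through to the primary copy, which is why sticky task-to-slot assignment matters for the feature to pay off.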

This message was sent by Atlassian JIRA
