flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-5892) Recover job state at the granularity of operator
Date Tue, 20 Jun 2017 07:00:01 GMT

    [ https://issues.apache.org/jira/browse/FLINK-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16055241#comment-16055241 ]

ASF GitHub Bot commented on FLINK-5892:
---------------------------------------

Github user zentol commented on a diff in the pull request:

    https://github.com/apache/flink/pull/3844#discussion_r122894136
  
    --- Diff: flink-tests/src/test/java/org/apache/flink/test/state/operator/restore/keyed/KeyedJob.java ---
    @@ -98,10 +98,9 @@ public static void main(String[] args) throws Exception {
     			.map(new StatefulStringStoringMap(mode, "first"))
     			.setParallelism(4);
     
    -		// TODO: re-enable this when generating the actual 1.2 savepoint
    -		//if (mode == ExecutionMode.MIGRATE || mode == ExecutionMode.RESTORE) {
    -		map.uid("first");
    -		//}
    +		if (mode == ExecutionMode.MIGRATE || mode == ExecutionMode.RESTORE) {
    --- End diff --
    
    You are completely right, the commit message and PR description aren't sufficient to explain
what this PR changes. In fact, it took me a while to remember it myself. I'll adjust the
commit message later on.
    
    So this PR is pretty subtle, since the changes to the code aren't the interesting part,
but the change to the `complexKeyed-flink1.2/_metadata` file is. This file is supposed to
be a 1.2 savepoint, used to verify that 1.3 can restore from 1.2 savepoints. But the file is
not actually a 1.2 savepoint: at the time of merging, the restoration of keyed 1.2 state was
broken, so in the meantime we used a 1.3 savepoint instead.
    
    The main thing this PR does is replace this 1.3 savepoint with an actual 1.2 savepoint.
    
    The second change is related to the UIDs. In 1.2, it is not possible to assign UIDs to
chained operators. As "first" and "second" are both chained to the window function, we are
not allowed to call `map.uid("...")` when generating the 1.2 savepoint (i.e. when
`!(MIGRATE || RESTORE)` holds). However, in 1.3 it is possible, and in fact mandatory, to assign UIDs.
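    
    To illustrate the condition, here is a minimal sketch in plain Java, without any Flink
dependency. The `Operator` class is a hypothetical stand-in for the real Flink operator
handle, and `GENERATE` is an assumed name for the mode that produces the 1.2 savepoint; only
the `MIGRATE || RESTORE` check is taken from the diff.

```java
public class UidAssignmentSketch {
    // MIGRATE and RESTORE appear in the diff; GENERATE is an assumed name
    // for the run that writes the 1.2 savepoint.
    enum ExecutionMode { GENERATE, MIGRATE, RESTORE }

    /** Hypothetical stand-in for an operator handle that can carry a UID. */
    static final class Operator {
        final String name;
        String uid; // null means no explicit UID was assigned

        Operator(String name) {
            this.name = name;
        }

        Operator uid(String uid) {
            this.uid = uid;
            return this;
        }
    }

    /** UIDs may only be set when the job is not generating the 1.2 savepoint. */
    static boolean mayAssignUid(ExecutionMode mode) {
        return mode == ExecutionMode.MIGRATE || mode == ExecutionMode.RESTORE;
    }

    /** Builds the "first" map operator, assigning its UID only where allowed. */
    static Operator buildFirstMap(ExecutionMode mode) {
        Operator map = new Operator("first");
        if (mayAssignUid(mode)) {
            map.uid("first");
        }
        return map;
    }

    public static void main(String[] args) {
        for (ExecutionMode mode : ExecutionMode.values()) {
            System.out.println(mode + " -> uid=" + buildFirstMap(mode).uid);
        }
    }
}
```

    In this sketch the operator keeps a null UID while the savepoint is generated and gets
the explicit UID "first" when the job runs in MIGRATE or RESTORE mode.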
    
    Does that clear things up?


> Recover job state at the granularity of operator
> ------------------------------------------------
>
>                 Key: FLINK-5892
>                 URL: https://issues.apache.org/jira/browse/FLINK-5892
>             Project: Flink
>          Issue Type: New Feature
>          Components: State Backends, Checkpointing
>            Reporter: Guowei Ma
>            Assignee: Guowei Ma
>             Fix For: 1.3.0
>
>
> JobGraph has no `Operator` info, so `ExecutionGraph` can only recover at the granularity of task.
> As a result, an operator of the job may not recover its state from a savepoint even if the
> savepoint contains that operator's state.
>  https://docs.google.com/document/d/19suTRF0nh7pRgeMnIEIdJ2Fq-CcNVt5_hR3cAoICf7Q/edit#.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
