flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-7595) Removing stateless task from task chain breaks savepoint restore
Date Wed, 29 Nov 2017 18:18:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-7595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16271279#comment-16271279 ]

ASF GitHub Bot commented on FLINK-7595:

GitHub user zentol opened a pull request:


     [FLINK-7595] [Savepoints] Allow removing stateless operators 

    This is a fixed version of #4651 for 1.4. It adds some checkstyle fixes and modifies
`SavepointLoaderTest` so that it actually contains a stateful task. That test currently passes
only because of the broken behavior.
    ## What is the purpose of the change
    This PR fixes a regression where stateless operators could no longer be removed from
a job when loading a savepoint without setting the `--allowNonRestoredState` flag. The loader
now explicitly checks whether the state of any operator that could not be mapped to the new
program is empty.
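
The check described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this PR: `SavepointMappingSketch`, `OperatorStateSketch`, `mapSavepointState`, and the string operator IDs are hypothetical names made up for the example, and the real `SavepointLoader` works on Flink's own state classes.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch; none of these names are Flink's actual classes. */
final class SavepointMappingSketch {

    /** Stand-in for the state one operator wrote into a savepoint. */
    static final class OperatorStateSketch {
        private final List<byte[]> subtaskStates;

        OperatorStateSketch(List<byte[]> subtaskStates) {
            this.subtaskStates = subtaskStates;
        }

        /** True if at least one subtask recorded non-empty state. */
        boolean hasState() {
            for (byte[] s : subtaskStates) {
                if (s != null && s.length > 0) {
                    return true;
                }
            }
            return false;
        }
    }

    /**
     * Returns the savepoint entries that can be restored into the new job.
     * Fails only if an unmapped entry actually holds state and the caller
     * did not opt in to dropping it (assumed analogue of --allowNonRestoredState).
     */
    static Map<String, OperatorStateSketch> mapSavepointState(
            Map<String, OperatorStateSketch> savepointStates,
            Set<String> operatorIdsInNewJob,
            boolean allowNonRestoredState) {
        Map<String, OperatorStateSketch> restorable = new LinkedHashMap<>();
        for (Map.Entry<String, OperatorStateSketch> e : savepointStates.entrySet()) {
            if (operatorIdsInNewJob.contains(e.getKey())) {
                restorable.put(e.getKey(), e.getValue());
            } else if (e.getValue().hasState() && !allowNonRestoredState) {
                throw new IllegalStateException(
                        "Cannot map savepoint state for operator " + e.getKey());
            }
            // Unmapped entries that are empty (or explicitly allowed) are skipped.
        }
        return restorable;
    }

    public static void main(String[] args) {
        Map<String, OperatorStateSketch> savepoint = new LinkedHashMap<>();
        savepoint.put("head", new OperatorStateSketch(Collections.singletonList(new byte[] {1})));
        savepoint.put("map", new OperatorStateSketch(Collections.singletonList(new byte[0])));

        // The stateless "map" operator was removed from the new job.
        Set<String> newJob = new HashSet<>(Collections.singletonList("head"));
        System.out.println(mapSavepointState(savepoint, newJob, false).keySet()); // prints [head]
    }
}
```

In this sketch, removing the stateless `map` entry succeeds without the opt-in flag, while removing the stateful `head` entry would still fail unless `allowNonRestoredState` is `true`.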
    ## Brief change log
    * Modify `SavepointLoader` to check whether the unmapped state is actually empty
    * Modify `AbstractOperatorRestoreTestBase` to allow subclasses to set the `--allowNonRestoredState` flag
    * Add a modified version of `ChainLengthDecreaseTest` to prevent this issue from re-emerging.
    ## Verifying this change
    This change added tests and can be verified as follows:
    Run `ChainLengthStatelessDecreaseTest`. Alternatively, run the reproducer from the JIRA
before and after the change.
    ## Does this pull request potentially affect one of the following parts:
      - Dependencies (does it add or upgrade a dependency): (no)
      - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no)
      - The serializers: (no)
      - The runtime per-record code paths (performance sensitive): (no)
      - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing,
Yarn/Mesos, ZooKeeper: (yes)
    ## Documentation
      - Does this pull request introduce a new feature? (no)
      - If yes, how is the feature documented? (not applicable)
    This should be merged to 1.3 and master. Note that for 1.3 it may be necessary to backport
the `OperatorSubtaskState#hasState()` method.
    @StefanRRichter @uce 

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zentol/flink 7595_14

Alternatively you can review and apply these changes as the patch at:


To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5103

commit 083d43b0365705c6d6355d5609da6b812d3ac909
Author: zentol <chesnay@apache.org>
Date:   2017-09-06T13:38:20Z

    [FLINK-7595] [Savepoints] Allow removing stateless operators

commit 2e6bad3ee473aeb76c323d826d1f57b51f2968a2
Author: zentol <chesnay@apache.org>
Date:   2017-11-29T17:57:51Z


commit d0674e8c135ab195ed91ef0ebbcb82f9e9aec79e
Author: zentol <chesnay@apache.org>
Date:   2017-11-29T18:14:18Z

    make task stateful in SavepointLoaderTest


> Removing stateless task from task chain breaks savepoint restore
> ----------------------------------------------------------------
>                 Key: FLINK-7595
>                 URL: https://issues.apache.org/jira/browse/FLINK-7595
>             Project: Flink
>          Issue Type: Bug
>          Components: State Backends, Checkpointing
>            Reporter: Ufuk Celebi
>            Assignee: Chesnay Schepler
>         Attachments: ChainedTaskRemoveTest.java
> Removing a stateless operator from a 2-task chain where the head operator is stateful
> breaks savepoint restore with:
> {code}
> Caused by: java.lang.IllegalStateException: Failed to rollback to savepoint /var/folders/py/s_1l8vln6f19ygc77m8c4zhr0000gn/T/junit1167397515334838028/junit8006766303945373008/savepoint-cb0bcf-3cfa67865ac0. Cannot map savepoint state...
> {code}

This message was sent by Atlassian JIRA