flink-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Till Rohrmann (Jira)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-20427) Remove CheckpointConfig.setPreferCheckpointForRecovery because it can lead to data loss
Date Thu, 20 May 2021 08:03:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-20427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17348138#comment-17348138

Till Rohrmann commented on FLINK-20427:

I think this is still a very important issue to sort out and fix because it can lead to unwanted
data loss.

> Remove CheckpointConfig.setPreferCheckpointForRecovery because it can lead to data loss
> ---------------------------------------------------------------------------------------
>                 Key: FLINK-20427
>                 URL: https://issues.apache.org/jira/browse/FLINK-20427
>             Project: Flink
>          Issue Type: Bug
>          Components: API / DataStream, Runtime / Checkpointing
>    Affects Versions: 1.12.0
>            Reporter: Till Rohrmann
>            Priority: Critical
>             Fix For: 1.14.0
> The {{CheckpointConfig.setPreferCheckpointForRecovery}} allows to configure whether Flink
prefers checkpoints for recovery if the {{CompletedCheckpointStore}} contains savepoints and
checkpoints. This is problematic because due to this feature, Flink might prefer older checkpoints
over newer savepoints for recovery. Since some components expect that the always the latest
checkpoint/savepoint is used (e.g. the {{SourceCoordinator}}), it breaks assumptions and can
lead to {{SourceSplits}} which are not read. This effectively means that the system loses
data. Similarly, this behaviour can cause that exactly once sinks might output results multiple
times which violates the processing guarantees. Hence, I believe that we should remove this
setting because it changes Flink's behaviour in some very significant way potentially w/o
the user noticing.

This message was sent by Atlassian Jira

View raw message