flink-issues mailing list archives

From "vinoyang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-9465) Separate timeout for savepoint and checkpoint
Date Tue, 28 May 2019 15:36:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-9465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16849850#comment-16849850 ]

vinoyang commented on FLINK-9465:

[~till.rohrmann] we have run into this several times, but not very often (we mainly use
checkpoints). It would be better to hear the details from [~kien_truong]. IMO, it sounds
valuable too.

> Separate timeout for savepoint and checkpoint
> ---------------------------------------------
>                 Key: FLINK-9465
>                 URL: https://issues.apache.org/jira/browse/FLINK-9465
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Checkpointing
>    Affects Versions: 1.5.0
>            Reporter: Truong Duc Kien
>            Assignee: vinoyang
>            Priority: Major
> Savepoints can take much longer to complete than checkpoints, especially with incremental
> checkpoints enabled. This causes two problems:
>  * For our job, we currently have to set the checkpoint timeout much larger than necessary;
> otherwise we would be unable to take a savepoint.
>  * During rush hour, our cluster encounters a high rate of checkpoint timeouts due
> to backpressure, but we are unable to migrate to a larger configuration because savepoints
> also time out.
> In my opinion, the savepoint timeout should be configurable separately, both in the
> config file and as a parameter to the savepoint command.
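
To make the proposal concrete, here is a minimal sketch of how the two timeouts might be resolved. It is only an illustration, not Flink code: the class SnapshotTimeouts, the method resolve, and the precedence order (command-line value, then config-file value, then the checkpoint timeout as fallback) are all assumptions and do not appear in the issue.

```java
import java.time.Duration;
import java.util.Optional;

/** Hypothetical helper for illustration only; not part of Flink's API. */
final class SnapshotTimeouts {
    private SnapshotTimeouts() {}

    /**
     * Picks the timeout for a snapshot.
     * Checkpoints always use the checkpoint timeout.
     * Savepoints use (assumed precedence) the command-line value if given,
     * else the config-file value, else fall back to the checkpoint timeout.
     */
    static Duration resolve(
            boolean isSavepoint,
            Duration checkpointTimeout,
            Optional<Duration> configuredSavepointTimeout,
            Optional<Duration> commandLineSavepointTimeout) {
        if (!isSavepoint) {
            return checkpointTimeout;
        }
        Duration fromConfigOrDefault = configuredSavepointTimeout.orElse(checkpointTimeout);
        return commandLineSavepointTimeout.orElse(fromConfigOrDefault);
    }
}
```

With something like this, the checkpoint timeout could stay short enough to surface backpressure problems, while a savepoint taken for a migration could be given a longer limit without touching the checkpoint setting.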

