flink-issues mailing list archives

From "Victor Wong (Jira)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-14653) Job-related errors in snapshotState do not result in job failure
Date Tue, 26 Nov 2019 12:38:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-14653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16982434#comment-16982434 ]

Victor Wong commented on FLINK-14653:
-------------------------------------

[~mxm], is there any progress on this?

I have two possible solutions; would you mind taking a look?

*Solution 1:*

Catch any exception thrown by `CheckpointedFunction#snapshotState` and rethrow it as an
*Error*, as the Beam patch did.
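
A minimal sketch of Solution 1, assuming a simplified setting rather than Flink's actual task code. `SnapshotFunction` is a hypothetical stand-in for `CheckpointedFunction` and `SnapshotGuard` is an illustrative helper; neither exists in Flink or Beam.

```java
// Hypothetical stand-in for CheckpointedFunction#snapshotState; not the Flink interface.
interface SnapshotFunction {
    void snapshotState() throws Exception;
}

// Illustrative helper (an assumption, not Flink code): escalates any Exception
// thrown by user snapshot logic to an Error, so that a caller which only
// catches Exception cannot swallow it as a checkpointing error.
final class SnapshotGuard {
    private SnapshotGuard() {}

    static void snapshotOrEscalate(SnapshotFunction fn) {
        try {
            fn.snapshotState();
        } catch (Exception e) {
            // Keep the original exception as the cause for diagnosis.
            throw new Error("snapshotState failed in user code", e);
        }
    }
}
```

The downside of this approach is that `Error` is normally reserved for serious JVM-level problems, so the job-level intent is not visible from the type alone.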

 

*Solution 2:*

Catch any exception thrown by `CheckpointedFunction#snapshotState` and rethrow it as a new
exception type, e.g. *SnapshotStateException*. A later handler would catch SnapshotStateException
and not mark the CheckpointFailureReason as CHECKPOINT_DECLINED, so the failure would not be
ignored even if the user has configured the job to tolerate checkpointing failures.
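
A rough sketch of Solution 2 under simplifying assumptions. `SnapshotStateException` is the type proposed above and does not exist in Flink; `UserSnapshot` and `CheckpointFailureHandler` are hypothetical stand-ins that only model a tolerable-failure counter, not Flink's real failure handling.

```java
// Hypothetical exception type proposed in Solution 2; it is not part of Flink.
class SnapshotStateException extends Exception {
    SnapshotStateException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical stand-in for CheckpointedFunction#snapshotState.
interface UserSnapshot {
    void snapshotState() throws Exception;
}

// Illustrative failure handler (an assumption, not Flink code).
final class CheckpointFailureHandler {
    private final int tolerableFailures;
    private int declinedCheckpoints = 0;

    CheckpointFailureHandler(int tolerableFailures) {
        this.tolerableFailures = tolerableFailures;
    }

    // Wraps user snapshot logic so that its failures carry a distinct type.
    static void runUserSnapshot(UserSnapshot fn) throws SnapshotStateException {
        try {
            fn.snapshotState();
        } catch (Exception e) {
            throw new SnapshotStateException("snapshotState failed in user code", e);
        }
    }

    // Returns true if the job should fail because of this checkpoint failure.
    boolean shouldFailJob(Throwable failure) {
        // Walk the cause chain, since the exception may have been wrapped on the way up.
        for (Throwable t = failure; t != null; t = t.getCause()) {
            if (t instanceof SnapshotStateException) {
                return true; // user-code failure: bypass the tolerance setting
            }
        }
        // Anything else is treated like a declined checkpoint and counted.
        declinedCheckpoints++;
        return declinedCheckpoints > tolerableFailures;
    }
}
```

With a handler created as `new CheckpointFailureHandler(10)`, an `IOException` from checkpoint storage would be tolerated, while any failure whose cause chain contains a `SnapshotStateException` would fail the job immediately.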

> Job-related errors in snapshotState do not result in job failure
> ----------------------------------------------------------------
>
>                 Key: FLINK-14653
>                 URL: https://issues.apache.org/jira/browse/FLINK-14653
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Checkpointing
>            Reporter: Maximilian Michels
>            Priority: Minor
>
> When users override {{snapshotState}}, they might include logic there which is crucial
> for the correctness of their application, e.g. finalizing a transaction and buffering the
> results of that transaction, or flushing events to an external store. Exceptions occurring
> there should lead to failing the job.
> Currently, users must make sure to throw a {{Throwable}} because any {{Exception}} will
> be caught by the task and reported as a checkpointing error, when it could be an application
> error.
> It would be helpful to update the documentation and introduce a special exception that
> can be thrown for job-related failures, e.g. {{ApplicationError}} or similar.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
