flink-issues mailing list archives

From "Stephan Ewen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-3018) Job state discarded from web interface for restarting jobs
Date Tue, 17 Nov 2015 14:26:11 GMT

    [ https://issues.apache.org/jira/browse/FLINK-3018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15008728#comment-15008728 ]

Stephan Ewen commented on FLINK-3018:

When a job fails, there are multiple execution attempts. They are all remembered, but the
UI shows only the latest one.

For batch jobs, that makes sense. For streaming jobs, I am unsure. To get this right, we would
need to include the metrics in the checkpointing, which is a major change. The metrics are
currently communicated asynchronously, as part of heartbeat-like messages, so they interfere
with nothing.
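
To illustrate why the counters would have to be part of the checkpoint, here is a minimal
sketch in Python. It is not Flink code: the `Attempt` class and the three functions are
hypothetical names, and the sketch assumes that a restarted streaming job replays the
records processed since the last completed checkpoint.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Attempt:
    """One execution attempt of a task (hypothetical model, not a Flink class)."""
    records_at_restore: int  # counter value stored in the checkpoint this attempt started from
    records_processed: int   # records processed by this attempt itself, including replayed ones


def latest_only(attempts: List[Attempt]) -> int:
    """What the UI shows today: only the counter of the latest attempt."""
    return attempts[-1].records_processed


def naive_sum(attempts: List[Attempt]) -> int:
    """Summing all remembered attempts counts replayed records twice."""
    return sum(a.records_processed for a in attempts)


def checkpointed_total(attempts: List[Attempt]) -> int:
    """With the counter stored in the checkpoint, the latest attempt continues from it."""
    last = attempts[-1]
    return last.records_at_restore + last.records_processed


# Attempt 1 processes 1000 records and fails; its last checkpoint covered 800 of them.
# Attempt 2 restores from that checkpoint and processes 500 records (200 of them replayed).
attempts = [
    Attempt(records_at_restore=0, records_processed=1000),
    Attempt(records_at_restore=800, records_processed=500),
]

print(latest_only(attempts))         # 500
print(naive_sum(attempts))           # 1500
print(checkpointed_total(attempts))  # 1300
```

In this made-up scenario only the checkpointed variant matches the 1300 distinct records
processed; the other two either lose the history or count the replayed records twice.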

I disagree that this is critical, though. Accumulated numbers mean little in many continuous
streaming settings, where metrics like throughput and latency mean much more and are based
on recent stats anyway.

> Job state discarded from web interface for restarting jobs
> ----------------------------------------------------------
>                 Key: FLINK-3018
>                 URL: https://issues.apache.org/jira/browse/FLINK-3018
>             Project: Flink
>          Issue Type: Improvement
>          Components: Webfrontend
>            Reporter: Gyula Fora
>            Priority: Critical
> When a streaming job goes into the restarting status and recovers, the web UI information
> is totally lost (number of records/bytes processed, etc.).
> This is very misleading and should not happen.

This message was sent by Atlassian JIRA
