flink-issues mailing list archives

From "Piotr Nowojski (Jira)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-14814) Show the vertex that produces the backpressure source in the job
Date Mon, 18 Nov 2019 08:39:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-14814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16976364#comment-16976364 ]

Piotr Nowojski commented on FLINK-14814:

I think having multiple output edges is not that common, and even then one can deduce the state
from the combined output usage, based on the fact that buffers are rarely in states other
than "mostly empty" and "mostly full". A value of {{outputUsage}} hovering around 50% means
one output is full and the other is empty. Because of that I wouldn't worry about it too much,
at least not in the first version.
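
To illustrate the reasoning above, here is a minimal sketch. It assumes the combined {{outputUsage}} is a plain average over the outputs and that each output is either close to 0% or close to 100% used. The function name and the averaging rule are hypothetical, not part of Flink.

```python
import math


def estimate_full_outputs(combined_usage: float, num_outputs: int) -> int:
    """Estimate how many outputs are "mostly full".

    Assumption (not from Flink): combined_usage is the plain average of the
    per-output usages, and each output is near 0.0 or near 1.0.
    """
    if num_outputs <= 0:
        raise ValueError("num_outputs must be positive")
    if not 0.0 <= combined_usage <= 1.0:
        raise ValueError("combined_usage must be within [0.0, 1.0]")
    # Under the assumption above, average * count approximates the number
    # of full outputs; round half up to the nearest whole output.
    return math.floor(combined_usage * num_outputs + 0.5)


# With two outputs, a combined usage around 50% suggests one full, one empty.
full = estimate_full_outputs(0.52, 2)
```

If buffers spent a lot of time half full, this estimate would stop being meaningful, which is why it rests on the "mostly empty or mostly full" observation.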

I think the bigger problem is that your screenshot displays the tasks, not individual subtasks/parallel
instances. This raises a few questions:
# do we want to present non-aggregated metrics for the subtasks?
# do we want to present aggregated metrics for the tasks? ...
# ... if so, how to aggregate the metrics (and who should be doing that)?

1. would be easier to do and significantly more detailed and fine-grained, but less
user-friendly and more difficult to use.
2. would lose some information in exchange for simpler usage.

(We might want to do both, or do one first and the other later.)

3. we would have to decide how to aggregate the individual values. For example, if a single subtask
is back-pressured, do we report that the whole task is back-pressured? For pool usage, should we
average the values? Take the max? As for who should be doing that: it shouldn't be the UI, so in
that case we would need one more metrics-related ticket to actually come up with an idea of how
to aggregate the metrics.
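
As a rough sketch of the aggregation options in point 3, the snippet below computes several candidate task-level values from per-subtask metrics. All names ({{SubtaskMetrics}}, {{TaskSummary}}, {{aggregate}}) and fields are hypothetical and are not existing Flink metrics or APIs. It only shows what "any", "ratio", "average" and "max" would report for the same input.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Sequence


@dataclass(frozen=True)
class SubtaskMetrics:
    """Hypothetical per-subtask values; not an actual Flink type."""
    back_pressured: bool
    output_usage: float  # fraction of the output buffer pool in use, 0.0 to 1.0


@dataclass(frozen=True)
class TaskSummary:
    """Candidate task-level aggregates computed from the subtasks."""
    any_back_pressured: bool     # task counts as back-pressured if one subtask is
    back_pressured_ratio: float  # share of back-pressured subtasks
    avg_output_usage: float
    max_output_usage: float


def aggregate(subtasks: Sequence[SubtaskMetrics]) -> TaskSummary:
    if not subtasks:
        raise ValueError("at least one subtask is required")
    flags = [s.back_pressured for s in subtasks]
    usages = [s.output_usage for s in subtasks]
    return TaskSummary(
        any_back_pressured=any(flags),
        back_pressured_ratio=sum(flags) / len(flags),
        avg_output_usage=mean(usages),
        max_output_usage=max(usages),
    )


# One back-pressured subtask out of three: "any" flags the whole task,
# while the average usage hides the hot subtask and the max exposes it.
summary = aggregate([
    SubtaskMetrics(False, 0.1),
    SubtaskMetrics(True, 0.9),
    SubtaskMetrics(False, 0.2),
])
```

Which of these values should be exposed, and where they should be computed, is exactly the open question for the follow-up ticket.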

> Show the vertex that produces the backpressure source in the job
> ----------------------------------------------------------------
>                 Key: FLINK-14814
>                 URL: https://issues.apache.org/jira/browse/FLINK-14814
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Metrics, Runtime / Network, Runtime / REST, Runtime / Web Frontend
>            Reporter: lining
>            Assignee: lining
>            Priority: Major
>         Attachments: 2B0E910D-6D95-401F-B450-1F6B1AFB9BEA.png
> By checking the status of the output and input buffer pools exposed via FLINK-14815 (output
> buffer empty, input buffer full), it is possible to display which node is the source of the back
> pressure. This information could be displayed or made accessible in the Web Frontend.

This message was sent by Atlassian Jira
