flink-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-3660) Measure latency of elements and expose it through web interface
Date Fri, 09 Sep 2016 08:52:20 GMT

    [ https://issues.apache.org/jira/browse/FLINK-3660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15476381#comment-15476381
] 

ASF GitHub Bot commented on FLINK-3660:
---------------------------------------

Github user aljoscha commented on a diff in the pull request:

    https://github.com/apache/flink/pull/2386#discussion_r78147868
  
    --- Diff: flink-streaming-java/src/main/java/org/apache/flink/streaming/api/collector/selector/DirectedOutput.java
---
    @@ -95,11 +100,17 @@ public DirectedOutput(
     
     	@Override
     	public void emitWatermark(Watermark mark) {
    -		for (Output<StreamRecord<OUT>> out : allOutputs) {
    +		for (Output<StreamRecord<OUT>> out: allOutputs) {
    --- End diff --
    
    I think the space was good here.


> Measure latency of elements and expose it through web interface
> ---------------------------------------------------------------
>
>                 Key: FLINK-3660
>                 URL: https://issues.apache.org/jira/browse/FLINK-3660
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Streaming
>            Reporter: Robert Metzger
>            Assignee: Robert Metzger
>             Fix For: pre-apache
>
>
> It would be nice to expose the end-to-end latency of a streaming job in the webinterface.
> To achieve this, my initial thought was to attach an ingestion-time timestamp at the
sources to each record.
> However, this introduces overhead for a monitoring feature users might not even use (8
bytes for each element + System.currentTimeMilis() on each element).
> Therefore, I suggest to implement this feature by periodically sending special events,
similar to watermarks through the topology. 
> Those {{LatencyMarks}} are emitted at a configurable interval at the sources and forwarded
by the tasks. The sinks will compare the timestamp of the latency marks with their current
system time to determine the latency.
> The latency marks will not add to the latency of a job, but the marks will be delayed
similarly than regular records, so their latency will approximate the record latency.
> Above suggestion expects the clocks on all taskmanagers to be in sync. Otherwise, the
measured latencies would also include the offsets between the taskmanager's clocks.
> In a second step, we can try to mitigate the issue by using the JobManager as a central
timing service. The TaskManagers will periodically query the JM for the current time in order
to determine the offset with their clock.
> This offset would still include the network latency between TM and JM but it would still
lead to reasonably good estimations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message