flink-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-8144) Optimize the timer logic in RowTimeUnboundedOver
Date Fri, 24 Nov 2017 12:05:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-8144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16265213#comment-16265213
] 

ASF GitHub Bot commented on FLINK-8144:
---------------------------------------

Github user dianfu commented on a diff in the pull request:

    https://github.com/apache/flink/pull/5063#discussion_r152955709
  
    --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/aggregate/RowTimeUnboundedOver.scala
---
    @@ -116,7 +116,7 @@ abstract class RowTimeUnboundedOver(
         // discard late record
         if (timestamp > curWatermark) {
           // ensure every key just registers one timer
    -      ctx.timerService.registerEventTimeTimer(curWatermark + 1)
    +      ctx.timerService.registerEventTimeTimer(timestamp)
    --- End diff --
    
    @fhueske Thanks a lot for your comments. Your concern makes sense to me. I think the current
implementation is ok under periodic watermark. But I'm not sure if it's optimal under punctuated
watermark. We will perform some performance test for unbounded over under punctuated watermark
and share the results.


> Optimize the timer logic in RowTimeUnboundedOver
> ------------------------------------------------
>
>                 Key: FLINK-8144
>                 URL: https://issues.apache.org/jira/browse/FLINK-8144
>             Project: Flink
>          Issue Type: Bug
>          Components: Table API & SQL
>            Reporter: Dian Fu
>            Assignee: Dian Fu
>             Fix For: 1.5.0
>
>
> Currently the logic of {{RowTimeUnboundedOver}} is as follows:
> 1) When element comes, buffer it in MapState and and register a timer at {{current watermark
+ 1}}
> 2) When event timer triggered, scan the MapState and find the elements below the current
watermark and process it. If there are remaining elements to process, register a new timer
at {{current watermark + 1}}.
> Let's assume that watermark comes about 5 seconds later than the event on average, then
we will scan about 5000 times the MapState before actually processing the events.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Mime
View raw message