flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-5479) Per-partition watermarks in FlinkKafkaConsumer should consider idle partitions
Date Tue, 06 Mar 2018 12:45:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-5479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16387700#comment-16387700
] 

ASF GitHub Bot commented on FLINK-5479:
---------------------------------------

Github user StephanEwen commented on the issue:

    https://github.com/apache/flink/pull/5634
  
    I would suggest approaching this in a different way:
    
      1. Idleness detection is something that watermark generation benefits from in general,
not just in Kafka.
      2. Unless there is a very strong reason, I would not want to add anything more to
the Kafka connector. The connector implementation is already very large, and we have seen
multiple issues in the past where its complexity was the cause of problems.


> Per-partition watermarks in FlinkKafkaConsumer should consider idle partitions
> ------------------------------------------------------------------------------
>
>                 Key: FLINK-5479
>                 URL: https://issues.apache.org/jira/browse/FLINK-5479
>             Project: Flink
>          Issue Type: Improvement
>          Components: Kafka Connector
>            Reporter: Tzu-Li (Gordon) Tai
>            Priority: Blocker
>             Fix For: 1.5.0
>
>
> Reported in ML: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Kafka-topic-partition-skewness-causes-watermark-not-being-emitted-td11008.html
> Similar to idle sources blocking watermark progression in downstream operators (see
FLINK-5017), the per-partition watermark mechanism in {{FlinkKafkaConsumer}} is also blocked
from progressing watermarks when a partition is idle. The watermark of an idle partition
stays at {{Long.MIN_VALUE}}, so the overall minimum watermark across all partitions of a
consumer subtask never advances.
> Kafka partitions that produce no data at all are not a common case, but it would still
be good to handle. I think we should have a localized solution similar to FLINK-5017 for
the per-partition watermarks in {{AbstractFetcher}}.
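For context, the blocking behavior described in the issue follows directly from how per-partition watermarks are combined: the consumer subtask may only emit the minimum watermark across all of its partitions, so one idle partition stuck at Long.MIN_VALUE pins the overall watermark forever. A minimal sketch of that combination step (class and method names here are illustrative, not Flink's actual AbstractFetcher API):

```java
import java.util.Arrays;

// Illustrative sketch only -- not the real FlinkKafkaConsumer/AbstractFetcher code.
public class MinWatermarkSketch {

    // The subtask-level watermark is the minimum over all per-partition watermarks;
    // emitting anything larger could violate watermark semantics for a lagging partition.
    static long overallWatermark(long[] perPartitionWatermarks) {
        return Arrays.stream(perPartitionWatermarks).min().orElse(Long.MIN_VALUE);
    }

    public static void main(String[] args) {
        // Two active partitions plus one idle partition that never produced a record,
        // so its watermark is still Long.MIN_VALUE.
        long[] withIdle = {5_000L, 7_000L, Long.MIN_VALUE};
        // The overall watermark never advances past Long.MIN_VALUE.
        System.out.println(overallWatermark(withIdle) == Long.MIN_VALUE);

        // With all partitions active, the watermark advances to the slowest partition.
        long[] allActive = {5_000L, 7_000L, 6_000L};
        System.out.println(overallWatermark(allActive));
    }
}
```

A localized fix along the lines of FLINK-5017 would exclude partitions marked idle from this minimum, which is exactly what the issue proposes for {{AbstractFetcher}}.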



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
