flink-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Tzu-Li (Gordon) Tai (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-4722) Consumer group concept not working properly with FlinkKafkaConsumer09
Date Tue, 04 Oct 2016 02:59:20 GMT

    [ https://issues.apache.org/jira/browse/FLINK-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15544170#comment-15544170

Tzu-Li (Gordon) Tai commented on FLINK-4722:

One last comment:
Iterating over all data shouldn't be required. Say you have 3 parallelism (3 subtasks) for
the FlinkKafkaConsumer. Each subtask will be assigned a single partition and read only that
partition. Using a {{KeyedDeserializationSchema}}, you can key each data record with the partition
id "within the source". That means you won't need another map function to do this keying.
The data that comes out of FlinkKafkaConsumer will be (Tuple(partition, matrixData))

The only "iteration" would probably be a "keyBy" operation after the source. I think, in use
cases like yours, you would benefit if the returned stream from the source is already a {{KeyedStream}},
to take advantage of pre-partitioned data outside of Flink. I'll think a bit about this idea!

In any way, writing your own custom consumer also works for now :)

> Consumer group concept not working properly with FlinkKafkaConsumer09  
> -----------------------------------------------------------------------
>                 Key: FLINK-4722
>                 URL: https://issues.apache.org/jira/browse/FLINK-4722
>             Project: Flink
>          Issue Type: Bug
>          Components: Kafka Connector
>    Affects Versions: 1.1.2
>            Reporter: Sudhanshu Sekhar Lenka
> When Kafka one Topic has 3 partition and 3 FlinkKafkaConsumer09 connected to that same
topic using "group.id" ,"myGroup" property . Still flink consumer get all data which are push
to each 3   partition . While it work properly with normal java consumer. each consumer get
specific data.

This message was sent by Atlassian JIRA

View raw message