kafka-users mailing list archives

From Hrishikesh Mishra <sd.hri...@gmail.com>
Subject Kafka Group coordinator discovery failing for subsequent restarts
Date Thu, 29 Aug 2019 04:18:47 GMT
Hi,

We are facing the following issues with our Kafka cluster.

   - Kafka Version: 2.0.0
   - We have the following cluster configuration:
   - Number of brokers: 14
   - Per Broker: 37GB Memory and 14 Cores.
   - Topics: 40 - 50
   - Partitions per topic: 32
   - Replicas: 3
   - Min in-sync replicas: 2
   - __consumer_offsets partitions: 50
   - offsets.topic.replication.factor=3
   - default.replication.factor=3
   - Consumers#: ~4000 (will grow to ~7K)
   - Consumer Groups#: ~4000  (will grow to ~7K)


Important: each consumer consumes from a single topic, and each consumer
group has only one consumer, due to some architectural constraints.

We are facing two major problems with consumer groups:

   - The first time we start a consumer with a new group name, it works
   very well. But a subsequent restart (with the previous / older group
   name) causes problems for some consumers. We get the following
   errors:

   INFO  [2019-08-28 19:05:34,481] [main] [AbstractCoordinator]: [Consumer
   clientId=djXXX#XXX-XXX-XXX-XX-5-1478729-XX-XXXXX-XX-ingestion-v2,
   groupId=djXXX#XXX-XXX-XXX-XX-5-1478729-XX-XXXXX-XX-ingestion-v2] Discovered
   group coordinator 10.XX.XXX.112:9092 (id: 2147483631 rack: null)
   INFO  [2019-08-28 19:05:34,481] [main] [AbstractCoordinator]: [Consumer
   clientId=djXXX#XXX-XXX-XXX-XX-5-1478729-XX-XXXXX-XX-ingestion-v2,
   groupId=djXXX#XXX-XXX-XXX-XX-5-1478729-XX-XXXXX-XX-ingestion-v2] Group
   coordinator 10.XX.XXX.112:9092 (id: 2147483631 rack: null) is unavailable
   or invalid, will attempt rediscovery
   INFO  [2019-08-28 19:05:34,582] [main] [AbstractCoordinator]: [Consumer
   clientId=djXXX#XXX-XXX-XXX-XX-5-1478729-XX-XXXXX-XX-ingestion-v2,
   groupId=djXXX#XXX-XXX-XXX-XX-5-1478729-XX-XXXXX-XX-ingestion-v2] Discovered
   group coordinator 10.32.197.112:9092 (id: 2147483631 rack: null)

   These messages keep repeating and the consumer is not able to start / poll.
   But if we change the group name, it works the first time without any issue
   (and fails on a subsequent restart). So it also suggests that there is no
   issue with the brokers. Could this be because of having a single consumer
   in each consumer group? If yes, what would be the workaround here?

   - The second error occurs while the consumer is up and running. After
   a couple of hours, it starts failing and throws the following error:
   Consumer clientId=banneXXXX#XX-XXX-XXX-XXX-X-1388688-XXX-XXXXX,
   groupId=bannerXXX#XX-XXX-XXX-XXX-X-1388688-XXX-XXXXX] Offset commit failed
   on partition banneXXXX-7 at offset 13711176: This is not the correct
   coordinator
   [Consumer
   clientId=banXXerGrXXMXX#XX-XX-XXXXX-XXX-5-1478733-XXX-XXXXX-ingestion-v2,
   groupId=banXXerGrXXMXX#XX-XX-XXXXX-XXX-5-1478733-XXX-XXXXX-ingestion-v2]
   Offset commit failed on partition banXXerGrXXMXX-8 at offset 14741: This is
   not the correct coordinator.
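
For reference, here is a minimal sketch of how a group id relates to its coordinator, which may help when checking the two errors above. It rests on two assumptions about Kafka internals that are not stated in this message: the broker picks the `__consumer_offsets` partition for a group as the non-negative Java `hashCode` of the group id modulo the partition count (50 here), and the Java client reports the coordinator under the id `Integer.MAX_VALUE - brokerId`. The function names and the sample group id are hypothetical.

```python
INT_MAX = 2**31 - 1  # Java's Integer.MAX_VALUE


def java_string_hash(s: str) -> int:
    """Java's String.hashCode(), kept as an unsigned 32-bit value."""
    data = s.encode("utf-16-be")  # Java hashes UTF-16 code units
    h = 0
    for i in range(0, len(data), 2):
        unit = (data[i] << 8) | data[i + 1]
        h = (31 * h + unit) & 0xFFFFFFFF  # emulate 32-bit overflow
    return h


def offsets_partition_for(group_id: str, num_partitions: int = 50) -> int:
    """Assumed mapping: (hashCode & 0x7fffffff) % partition count."""
    return (java_string_hash(group_id) & 0x7FFFFFFF) % num_partitions


def broker_id_from_coordinator_id(coordinator_id: int) -> int:
    """Assumed client convention: coordinator id = INT_MAX - broker id."""
    return INT_MAX - coordinator_id


if __name__ == "__main__":
    # The id from the log above decodes to broker id 16.
    print(broker_id_from_coordinator_id(2147483631))
    # Hypothetical group id; substitute a real one.
    print(offsets_partition_for("example-group-ingestion-v2"))
```

Under these assumptions, the leader of the resulting `__consumer_offsets` partition is the broker that should act as coordinator for the group, so comparing it with the decoded broker id shows whether the client is being sent to the expected broker.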


I would like to know the following:

   - What is the maximum number of consumer groups in a Kafka cluster? I
   didn't find any limit documented on the internet; everywhere it is
   mentioned that it is limited only by the OS.
   - Is there a problem if a consumer group has only one consumer?
   - Is there some problem with my Kafka configuration?




Regards
Hrishikesh
