kafka-users mailing list archives

From Hrishikesh Mishra <sd.hri...@gmail.com>
Subject Re: Kafka Group coordinator discovery failing for subsequent restarts
Date Thu, 29 Aug 2019 06:45:05 GMT
Please find my replies inline below:



On Thu, Aug 29, 2019 at 11:32 AM Lisheng Wang <wanglisheng81@gmail.com>
wrote:

> Hi
>
> About question 1: it doesn't matter how many consumers are in the same
> consumer group.
>
> So you mean the broker acting as coordinator did not crash at all before?
>

 We didn't see any shutdown errors on the brokers, and we faced a similar
problem with multiple coordinators.



> May I know whether exactly one broker (coordinator) is unavailable, or
> many are? If only one, you can try transferring leadership of the
> __consumer_offsets partitions on that broker to another broker, to see
> whether the problem goes away.
>
>
It happened with multiple consumer groups.
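
For reference, Lisheng's suggestion above (moving a __consumer_offsets partition off the affected broker) could be sketched with a reassignment file for kafka-reassign-partitions.sh. The partition number (8) and broker ids (17, 3, 9) below are placeholders for illustration, not values from this cluster:

```json
{
  "version": 1,
  "partitions": [
    { "topic": "__consumer_offsets", "partition": 8, "replicas": [17, 3, 9] }
  ]
}
```

On Kafka 2.0 this would, if I recall the tooling correctly, be applied with
`bin/kafka-reassign-partitions.sh --zookeeper <zk> --reassignment-json-file reassign.json --execute`,
followed by a preferred-replica election so the first replica in the list
becomes the leader (and hence the group coordinator for the groups mapped to
that partition).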




> I found the following issue, which seems similar to yours, FYR:
>
>
> https://stackoverflow.com/questions/51952398/kafka-connect-distributed-mode-the-group-coordinator-is-not-available
>

We have gone through this link, but in our case it is not always feasible to
clean data from the offsets topic and restart (our cluster size is huge).
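
For debugging which broker each failing group maps to, a small sketch of how the coordinator is chosen may help. This is an assumption based on Kafka's documented behaviour (the group id is hashed onto a __consumer_offsets partition, whose leader is the group coordinator; the node id the client logs is Integer.MAX_VALUE minus the broker id), not an official client API:

```python
def java_string_hashcode(s: str) -> int:
    """Replicate Java's String.hashCode(), including 32-bit overflow."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    # Reinterpret as a signed 32-bit integer, as Java does.
    return h - 0x100000000 if h >= 0x80000000 else h

def offsets_partition_for(group_id: str, num_partitions: int = 50) -> int:
    """__consumer_offsets partition owning this group.

    Kafka computes Utils.abs(groupId.hashCode()) % offsets partition count,
    where Utils.abs is a bitmask, not Math.abs.
    """
    return (java_string_hashcode(group_id) & 0x7FFFFFFF) % num_partitions

def broker_id_from_coordinator_node(node_id: int) -> int:
    """Clients log the coordinator as Integer.MAX_VALUE - brokerId,
    so it never collides with a real node id."""
    return 2147483647 - node_id

# Which __consumer_offsets partition owns a (hypothetical) group name:
print(offsets_partition_for("my-group"))
# The broker behind the coordinator id in the log excerpt below:
print(broker_id_from_coordinator_node(2147483631))  # -> 16
```

The leader of the printed partition (`kafka-topics.sh --describe --topic __consumer_offsets`) is the broker the failing group keeps rediscovering, which narrows down where to look in the broker logs.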


> Best,
> Lisheng
>
>
> Hrishikesh Mishra <sd.hrishi@gmail.com> wrote on Thursday, Aug 29, 2019 at 12:19 PM:
>
> > Hi,
> >
> > We are facing following issues with Kafka cluster.
> >
> >    - Kafka Version: 2.0.0
> >    - We have the following cluster configuration:
> >    - Number of brokers: 14
> >    - Per broker: 37 GB memory and 14 cores
> >    - Topics: 40 - 50
> >    - Partitions per topic: 32
> >    - Replicas: 3
> >    - Min In Sync Replica: 2
> >    - __consumer_offsets partitions: 50
> >    - offsets.topic.replication.factor=3
> >    - default.replication.factor=3
> >    - Consumers#: ~4000 (will grow to ~7K)
> >    - Consumer Groups#: ~4000  (will grow to ~7K)
> >
> >
> > Imp: Here one consumer consumes from one topic, and one consumer group
> > has only one consumer, due to some architectural constraints.
> >
> > Two major problems we are facing with consumer groups:
> >
> >    - The first time we start a consumer with a new group name, it works
> >    very well. But a subsequent restart (with the previous / older group
> >    name) causes problems for some consumers. We are getting the
> >    following errors:
> >
> >    INFO  [2019-08-28 19:05:34,481] [main] [AbstractCoordinator]:
> [Consumer
> >    clientId=djXXX#XXX-XXX-XXX-XX-5-1478729-XX-XXXXX-XX-ingestion-v2,
> >    groupId=djXXX#XXX-XXX-XXX-XX-5-1478729-XX-XXXXX-XX-ingestion-v2]
> > Discovered
> >    group coordinator 10.XX.XXX.112:9092 (id: 2147483631 rack: null)
> >    INFO  [2019-08-28 19:05:34,481] [main] [AbstractCoordinator]:
> [Consumer
> >    clientId=djXXX#XXX-XXX-XXX-XX-5-1478729-XX-XXXXX-XX-ingestion-v2,
> >    groupId=djXXX#XXX-XXX-XXX-XX-5-1478729-XX-XXXXX-XX-ingestion-v2] Group
> >    coordinator 10.XX.XXX.112:9092 (id: 2147483631 rack: null) is
> > unavailable
> >    or invalid, will attempt rediscovery
> >    INFO  [2019-08-28 19:05:34,582] [main] [AbstractCoordinator]:
> [Consumer
> >    clientId=djXXX#XXX-XXX-XXX-XX-5-1478729-XX-XXXXX-XX-ingestion-v2,
> >    groupId=djXXX#XXX-XXX-XXX-XX-5-1478729-XX-XXXXX-XX-ingestion-v2]
> > Discovered
> >    group coordinator 10.32.197.112:9092 (id: 2147483631 rack: null)
> >
> >    These messages keep coming and the consumer is not able to start /
> >    poll. But if we change the group name, then it works the first time
> >    without any issue (and fails on subsequent restarts). So it also means
> >    that there is no issue with the broker. Could it be because of having
> >    a single consumer in the consumer group, and if yes, what would be the
> >    workaround here?
> >
> >    - The second error occurs when the consumer is up and running: after
> >    a couple of hours, it starts failing and throws the following error:
> >    Consumer clientId=banneXXXX#XX-XXX-XXX-XXX-X-1388688-XXX-XXXXX,
> >    groupId=bannerXXX#XX-XXX-XXX-XXX-X-1388688-XXX-XXXXX] Offset commit
> > failed
> >    on partition banneXXXX-7 at offset 13711176: This is not the correct
> >    coordinator
> >    [Consumer
> >
> >
> clientId=banXXerGrXXMXX#XX-XX-XXXXX-XXX-5-1478733-XXX-XXXXX-ingestion-v2,
> >
> groupId=banXXerGrXXMXX#XX-XX-XXXXX-XXX-5-1478733-XXX-XXXXX-ingestion-v2]
> >    Offset commit failed on partition banXXerGrXXMXX-8 at offset 14741:
> > This is
> >    not the correct coordinator.
> >
> >
> > I wanted to know the following things:
> >
> >    - What is the maximum number of consumer groups a Kafka cluster can
> >    support? I didn't find any limit mentioned anywhere on the internet;
> >    everywhere it says it is limited only by the OS.
> >    - Is it a problem that a consumer group has only one consumer?
> >    - Is there some problem with my Kafka configuration?
> >
> >
> >
> >
> > Regards
> > Hrishikesh
> >
>
