kafka-users mailing list archives

From Dave Hamilton <dhamil...@nanigans.com>
Subject Re: Error finding consumer coordinators after restart
Date Fri, 13 Jan 2017 19:24:41 GMT
Just wanted to close the loop on this. It seems the consumer offset logs might have been corrupted by the system restart. Deleting the topic logs and restarting the Kafka service cleared up the problem.
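
For reference, the recovery described above amounts to stopping the broker, removing the corrupted __consumer_offsets partition directories, and restarting. A minimal sketch follows; the log directory path and service name are assumptions for a typical install, so adjust them to your environment:

```shell
# Sketch of the recovery steps described above. The log.dirs path
# (/var/lib/kafka/logs) and the service name (kafka) are assumptions
# for a typical install -- check server.properties for your log.dirs.

# 1. Stop the broker
sudo systemctl stop kafka

# 2. Remove the corrupted consumer-offsets partition logs
#    (partition 40 is the one named in the error below; repeat for any
#    other partitions reporting ClosedChannelException)
rm -rf /var/lib/kafka/logs/__consumer_offsets-40

# 3. Restart the broker; it will restore the partition from replicas
sudo systemctl start kafka
```

Note that with a replication factor of 1 (as the AdminOperationException below suggests), deleting the partition logs discards the committed offsets stored there, so consumers in affected groups will resume according to their `auto.offset.reset` setting.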

Thanks,
Dave



On 1/12/17, 2:29 PM, "Dave Hamilton" <dhamilton@nanigans.com> wrote:

    Hello, we ran into a memory issue on a Kafka 0.10.0.1 broker we are running that required a system restart. Since bringing Kafka back up, it seems the consumers are having issues finding their coordinators. Here are some errors we’ve seen in our server logs after restarting:
    
    [2017-01-12 19:02:10,178] ERROR [Group Metadata Manager on Broker 0]: Error in loading offsets from [__consumer_offsets,40] (kafka.coordinator.GroupMetadataManager)
    java.nio.channels.ClosedChannelException
                    at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:99)
                    at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:678)
                    at kafka.log.FileMessageSet.searchFor(FileMessageSet.scala:135)
                    at kafka.log.LogSegment.translateOffset(LogSegment.scala:106)
                    at kafka.log.LogSegment.read(LogSegment.scala:127)
                    at kafka.log.Log.read(Log.scala:532)
                    at kafka.coordinator.GroupMetadataManager$$anonfun$kafka$coordinator$GroupMetadataManager$$loadGroupsAndOffsets$1$1.apply$mcV$sp(GroupMetadataManager.scala:380)
                    at kafka.coordinator.GroupMetadataManager$$anonfun$kafka$coordinator$GroupMetadataManager$$loadGroupsAndOffsets$1$1.apply(GroupMetadataManager.scala:374)
                    at kafka.coordinator.GroupMetadataManager$$anonfun$kafka$coordinator$GroupMetadataManager$$loadGroupsAndOffsets$1$1.apply(GroupMetadataManager.scala:374)
                    at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:231)
                    at kafka.utils.CoreUtils$.inWriteLock(CoreUtils.scala:239)
                    at kafka.coordinator.GroupMetadataManager.kafka$coordinator$GroupMetadataManager$$loadGroupsAndOffsets$1(GroupMetadataManager.scala:374)
                    at kafka.coordinator.GroupMetadataManager$$anonfun$loadGroupsForPartition$1.apply$mcV$sp(GroupMetadataManager.scala:353)
                    at kafka.utils.KafkaScheduler$$anonfun$1.apply$mcV$sp(KafkaScheduler.scala:110)
                    at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:56)
                    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
                    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
                    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
                    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
                    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
                    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
                    at java.lang.Thread.run(Thread.java:744)
    [2017-01-12 19:03:56,468] ERROR [KafkaApi-0] Error when handling request {topics=[__consumer_offsets]} (kafka.server.KafkaApis)
    kafka.admin.AdminOperationException: replication factor: 1 larger than available brokers: 0
                    at kafka.admin.AdminUtils$.assignReplicasToBrokers(AdminUtils.scala:117)
                    at kafka.admin.AdminUtils$.createTopic(AdminUtils.scala:403)
                    at kafka.server.KafkaApis.kafka$server$KafkaApis$$createTopic(KafkaApis.scala:629)
                    at kafka.server.KafkaApis.kafka$server$KafkaApis$$createGroupMetadataTopic(KafkaApis.scala:651)
                    at kafka.server.KafkaApis$$anonfun$29.apply(KafkaApis.scala:668)
                    at kafka.server.KafkaApis$$anonfun$29.apply(KafkaApis.scala:666)
                    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
                    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
                    at scala.collection.immutable.Set$Set1.foreach(Set.scala:94)
                    at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
                    at scala.collection.AbstractSet.scala$collection$SetLike$$super$map(Set.scala:47)
                    at scala.collection.SetLike$class.map(SetLike.scala:92)
                    at scala.collection.AbstractSet.map(Set.scala:47)
                    at kafka.server.KafkaApis.getTopicMetadata(KafkaApis.scala:666)
                    at kafka.server.KafkaApis.handleTopicMetadataRequest(KafkaApis.scala:727)
                    at kafka.server.KafkaApis.handle(KafkaApis.scala:79)
                    at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:60)
                    at java.lang.Thread.run(Thread.java:744)
    
    Also, running kafka-consumer-groups.sh on a consumer group returns the following:
    
    Error while executing consumer group command This is not the correct coordinator for this group.
    org.apache.kafka.common.errors.NotCoordinatorForGroupException: This is not the correct coordinator for this group.
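    
    For completeness, an invocation along these lines is what produces that error (the bootstrap host is a placeholder; on 0.10.0.x the tool needs the --new-consumer flag to query group metadata via the broker rather than ZooKeeper):

```shell
# Describe a consumer group through the broker-side coordinator.
# localhost:9092 is a placeholder for one of the cluster's brokers.
bin/kafka-consumer-groups.sh \
  --bootstrap-server localhost:9092 \
  --new-consumer \
  --describe \
  --group connect-paid_events_s3
```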
    
    We also see the following logs when trying to restart a Kafka connector:
    
    [2017-01-12 17:44:07,941] INFO Discovered coordinator lxskfkdal501.nanigans.com:9092 (id: 2147483647 rack: null) for group connect-paid_events_s3. (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:505)
    [2017-01-12 17:44:07,941] INFO (Re-)joining group connect-paid_events_s3 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:326)
    [2017-01-12 17:44:07,941] INFO Marking the coordinator lxskfkdal501.nanigans.com:9092 (id: 2147483647 rack: null) dead for group connect-paid_events_s3 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:542)
    
    Does anyone have recommendations for what we can do to recover from this issue?
    
    Thanks,
    Dave
    
    
