kafka-users mailing list archives

From Jun Rao <jun...@gmail.com>
Subject Re: org.apache.zookeeper.KeeperException$BadVersionException
Date Wed, 11 Jun 2014 04:26:28 GMT
The info from kafka-topics is the correct one. Is broker 5 really dead? It
seems that you can still issue metadata requests to it.

Thanks,

Jun
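
For reference, a minimal sketch of issuing a topic metadata request directly
to one broker with the 0.8 Java API (host, port, and class name below are
placeholders, not the KafkaMetadata tool referenced below):

======================================================================
import java.util.Collections;

import kafka.javaapi.PartitionMetadata;
import kafka.javaapi.TopicMetadata;
import kafka.javaapi.TopicMetadataRequest;
import kafka.javaapi.TopicMetadataResponse;
import kafka.javaapi.consumer.SimpleConsumer;

public class MetadataCheck {
    public static void main(String[] args) {
        // Placeholder broker and topic; point these at the broker to probe.
        String host = "c-ccp-tk1-a60";
        int port = 9091;
        String topic = "topicTRACE";

        SimpleConsumer consumer =
                new SimpleConsumer(host, port, 100000, 64 * 1024, "metadata-check");
        try {
            // Ask this one broker for its view of the topic's metadata.
            TopicMetadataRequest request =
                    new TopicMetadataRequest(Collections.singletonList(topic));
            TopicMetadataResponse response = consumer.send(request);
            for (TopicMetadata tm : response.topicsMetadata()) {
                for (PartitionMetadata pm : tm.partitionsMetadata()) {
                    System.out.println("partition: " + pm.partitionId()
                            + ", leader: " + (pm.leader() == null ? -1 : pm.leader().id())
                            + ", replicas: " + pm.replicas()
                            + ", isr: " + pm.isr());
                }
            }
        } finally {
            consumer.close();
        }
    }
}
======================================================================

If the broker answers this request at all, its socket is alive even if the
controller considers it dead, which is why each broker can report a
different leader/ISR view.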


On Tue, Jun 10, 2014 at 8:26 PM, Bongyeon Kim <bongyeon.kim@gmail.com>
wrote:

> Yes, it was up, though with some WARN logs.
>
> Separately, I found something interesting.
> I have another cluster where I run 2 brokers on 1 machine for testing. I
> see the same problem I mentioned before, but there is no error log in
> controller.log.
>
> On that cluster, when I list topics with the kafka-topics tool, I see the
> information below.
>
> == kafka-topics.sh ===================================================
>
> Topic:topicTRACE        PartitionCount:2        ReplicationFactor:2     Configs:retention.ms=3600000
>         Topic: topicTRACE       Partition: 0    Leader: 6       Replicas: 5,6   Isr: 6
>         Topic: topicTRACE       Partition: 1    Leader: 6       Replicas: 6,5   Isr: 6,5
>
> ======================================================================
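>
> For reference, that listing is what the describe command prints; the
> ZooKeeper host below is a placeholder:
>
> $ bin/kafka-topics.sh --describe --zookeeper <zk-host>:2181 --topic topicTRACE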
>
> but the producer keeps producing to broker 5, which seems to be dead.
>
> When I get metadata from the brokers using my own Java API tool, the
> leader and ISR info differs.
> To be precise, the metadata from one broker is the same as the data from
> the kafka-topics tool, but the metadata from the other broker is different.
>
> ========================================================================
>
> $ bin/kafka-run-class.sh com.kthcorp.daisy.ccprt.util.KafkaMetadata
> c-ccp-tk1-a60:9091 topicTRACE
> ---- topic info ----
> partition: 0, leader: 5, replica: [id:5,host:c-ccp-tk1-a60,port:9091, id:6,host:c-ccp-tk1-a60,port:9092], isr: [id:5,host:c-ccp-tk1-a60,port:9091, id:6,host:c-ccp-tk1-a60,port:9092]
> partition: 1, leader: 6, replica: [id:6,host:c-ccp-tk1-a60,port:9092, id:5,host:c-ccp-tk1-a60,port:9091], isr: [id:6,host:c-ccp-tk1-a60,port:9092, id:5,host:c-ccp-tk1-a60,port:9091]
>
> ========================================================================
>
> $ bin/kafka-run-class.sh com.kthcorp.daisy.ccprt.util.KafkaMetadata
> c-ccp-tk1-a60:9092 topicTRACE
> ---- topic info ----
> partition: 0, leader: 6, replica: [id:5,host:c-ccp-tk1-a60,port:9091, id:6,host:c-ccp-tk1-a60,port:9092], isr: [id:6,host:c-ccp-tk1-a60,port:9092]
> partition: 1, leader: 6, replica: [id:6,host:c-ccp-tk1-a60,port:9092, id:5,host:c-ccp-tk1-a60,port:9091], isr: [id:6,host:c-ccp-tk1-a60,port:9092, id:5,host:c-ccp-tk1-a60,port:9091]
>
> ========================================================================
>
> Which one is correct? Why does this happen?
>
>
> Thanks~
>
>
>
> On Jun 10, 2014, at 11:28 PM, Jun Rao <junrao@gmail.com> wrote:
>
> > Ok. Was this host (broker id:1,host:c-ccp-tk1-a58,port:9091) up when the
> > controller had SocketTimeoutException?
> >
> > Thanks,
> >
> > Jun
> >
> >
> > On Mon, Jun 9, 2014 at 10:11 PM, Bongyeon Kim <bongyeon.kim@gmail.com>
> > wrote:
> >
> >> No, I can't see any ZK session expiration log.
> >>
> >> What do I have to do to prevent this? Can increasing
> >> 'zookeeper.session.timeout.ms' help?
> >>
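> >> A sketch of what that change would look like in config/server.properties
> >> (values are illustrative; the 0.8.1 default session timeout is 6000 ms):
> >>
> >> # Give the broker more headroom before ZooKeeper expires its session,
> >> # e.g. to ride out long GC pauses.
> >> zookeeper.session.timeout.ms=30000
> >> # How long to wait while first establishing the ZooKeeper connection.
> >> zookeeper.connection.timeout.ms=30000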
> >>
> >> On Tue, Jun 10, 2014 at 12:58 PM, Jun Rao <junrao@gmail.com> wrote:
> >>
> >>> This is probably related to KAFKA-1382. The root cause is likely a ZK
> >>> session expiration in the broker. Did you see any?
> >>>
> >>> Thanks,
> >>>
> >>> Jun
> >>>
> >>>
> >>> On Mon, Jun 9, 2014 at 8:11 PM, Bongyeon Kim <bongyeon.kim@gmail.com>
> >>> wrote:
> >>>
> >>>> Hi, team.
> >>>>
> >>>> I’m using 0.8.1.
> >>>> I found some strange log lines repeating in server.log on one of my
> >>>> brokers, and they are still being logged now.
> >>>>
> >>>> server.log
> >>>> ======================================================================================
> >>>> ...
> >>>> [2014-06-09 10:41:47,402] ERROR Conditional update of path
> >>>> /brokers/topics/topicTRACE/partitions/1/state with data
> >>>> {"controller_epoch":19,"leader":2,"version":1,"leader_epoch":43,"isr":[4,2]}
> >>>> and expected version 439 failed due to
> >>>> org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode =
> >>>> BadVersion for /brokers/topics/topicTRACE/partitions/1/state
> >>>> (kafka.utils.ZkUtils$)
> >>>> [2014-06-09 10:41:47,402] INFO Partition [topicTRACE,1] on broker 2:
> >>>> Cached zkVersion [439] not equal to that in zookeeper, skip updating ISR
> >>>> (kafka.cluster.Partition)
> >>>> [2014-06-09 10:41:47,402] INFO Partition [topicDEBUG,0] on broker 2:
> >>>> Shrinking ISR for partition [topicDEBUG,0] from 1,3,2 to 2
> >>>> (kafka.cluster.Partition)
> >>>> [2014-06-09 10:41:47,416] ERROR Conditional update of path
> >>>> /brokers/topics/topicDEBUG/partitions/0/state with data
> >>>> {"controller_epoch":19,"leader":2,"version":1,"leader_epoch":43,"isr":[2]}
> >>>> and expected version 1424 failed due to
> >>>> org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode =
> >>>> BadVersion for /brokers/topics/topicDEBUG/partitions/0/state
> >>>> (kafka.utils.ZkUtils$)
> >>>> [2014-06-09 10:41:47,432] INFO Partition [topicDEBUG,0] on broker 2:
> >>>> Cached zkVersion [1424] not equal to that in zookeeper, skip updating ISR
> >>>> (kafka.cluster.Partition)
> >>>> [2014-06-09 10:41:47,432] INFO Partition [topicCDR,3] on broker 2:
> >>>> Shrinking ISR for partition [topicCDR,3] from 4,1,2 to 2
> >>>> (kafka.cluster.Partition)
> >>>> [2014-06-09 10:41:47,435] ERROR Conditional update of path
> >>>> /brokers/topics/topicCDR/partitions/3/state with data
> >>>> {"controller_epoch":19,"leader":2,"version":1,"leader_epoch":46,"isr":[2]}
> >>>> and expected version 541 failed due to
> >>>> org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode =
> >>>> BadVersion for /brokers/topics/topicCDR/partitions/3/state
> >>>> (kafka.utils.ZkUtils$)
> >>>> [2014-06-09 10:41:47,435] INFO Partition [topicCDR,3] on broker 2: Cached
> >>>> zkVersion [541] not equal to that in zookeeper, skip updating ISR
> >>>> (kafka.cluster.Partition)
> >>>> [2014-06-09 10:41:48,426] INFO Partition [topicTRACE,1] on broker 2:
> >>>> Shrinking ISR for partition [topicTRACE,1] from 4,3,2 to 4,2
> >>>> (kafka.cluster.Partition)
> >>>> ...
> >>>> =================================================================================================
> >>>>
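> >>>> The "Conditional update" in those ERROR lines is ZooKeeper's
> >>>> compare-and-set on the znode version. A minimal sketch of the pattern
> >>>> (connect string, timeout, and payload are illustrative, not the actual
> >>>> kafka.utils.ZkUtils code):
> >>>>
> >>>> import org.apache.zookeeper.KeeperException;
> >>>> import org.apache.zookeeper.ZooKeeper;
> >>>> import org.apache.zookeeper.data.Stat;
> >>>>
> >>>> public class ConditionalUpdateSketch {
> >>>>     public static void main(String[] args) throws Exception {
> >>>>         ZooKeeper zk = new ZooKeeper("localhost:2181", 6000, null);
> >>>>         String path = "/brokers/topics/topicTRACE/partitions/1/state";
> >>>>         byte[] newState = "{\"leader\":2,\"isr\":[4,2]}".getBytes("UTF-8");
> >>>>
> >>>>         // The broker caches the version it last read for this znode.
> >>>>         Stat stat = zk.exists(path, false);
> >>>>         int cachedVersion = stat.getVersion();
> >>>>
> >>>>         try {
> >>>>             // Succeeds only if the znode version still equals cachedVersion.
> >>>>             zk.setData(path, newState, cachedVersion);
> >>>>         } catch (KeeperException.BadVersionException e) {
> >>>>             // Someone else (e.g. the controller) updated the znode first,
> >>>>             // so the cached zkVersion is stale and the ISR update is skipped.
> >>>>         }
> >>>>         zk.close();
> >>>>     }
> >>>> }
> >>>>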
> >>>> I also found some errors and warnings in controller.log.
> >>>>
> >>>>
> >>>> controller.log
> >>>> ======================================================================================
> >>>> ...
> >>>> [2014-06-09 10:42:03,962] WARN [Controller-3-to-broker-1-send-thread],
> >>>> Controller 3 fails to send a request to broker
> >>>> id:1,host:c-ccp-tk1-a58,port:9091 (kafka.controller.RequestSendThread)
> >>>> java.net.SocketTimeoutException
> >>>>         at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:229)
> >>>>         at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
> >>>>         at java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385)
> >>>>         at kafka.utils.Utils$.read(Utils.scala:375)
> >>>>         at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
> >>>>         at kafka.network.Receive$class.readCompletely(Transmission.scala:56)
> >>>>         at kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:29)
> >>>>         at kafka.network.BlockingChannel.receive(BlockingChannel.scala:100)
> >>>>         at kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:146)
> >>>>         at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:51)
> >>>> [2014-06-09 10:42:03,964] ERROR [Controller-3-to-broker-1-send-thread],
> >>>> Controller 3 epoch 21 failed to send UpdateMetadata request with
> >>>> correlation id 1 to broker id:1,host:c-ccp-tk1-a58,port:9091. Reconnecting
> >>>> to broker. (kafka.controller.RequestSendThread)
> >>>> java.nio.channels.ClosedChannelException
> >>>>         at kafka.network.BlockingChannel.send(BlockingChannel.scala:89)
> >>>>         at kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:132)
> >>>>         at kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:131)
> >>>>         at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:51)
> >>>>
> >>>> ...
> >>>>
> >>>> [2014-06-09 10:42:38,064] WARN [OfflinePartitionLeaderSelector]: No broker
> >>>> in ISR is alive for [topicTRACE,0]. Elect leader 3 from live brokers 3.
> >>>> There's potential data loss.
> >>>> (kafka.controller.OfflinePartitionLeaderSelector)
> >>>> ...
> >>>> =================================================================================================
> >>>>
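> >>>> The SocketTimeoutException above is the controller's send thread giving
> >>>> up on broker 1. If the broker is merely slow rather than down, the wait
> >>>> is governed by the broker setting below (a sketch; 30000 ms is the
> >>>> default as I understand it):
> >>>>
> >>>> # config/server.properties
> >>>> # Socket timeout for controller-to-broker channels.
> >>>> controller.socket.timeout.ms=30000
> >>>>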
> >>>> Why does this happen? Is there any possibility of data loss?
> >>>> What do I have to do to normalize my brokers? Do I have to restart this
> >>>> broker?
> >>>>
> >>>>
> >>>> Thanks in advance.
> >>>>
> >>>>
> >>>>
> >>>
> >>
> >>
> >>
> >> --
> >> *Sincerely*
> >> *,**Bongyeon Kim*
> >>
> >> Java Developer & Engineer
> >> Seoul, Korea
> >> Mobile:  +82-10-9369-1314
> >> Email:  bongyeonkim@gmail.com
> >> Twitter:  http://twitter.com/tigerby
> >> Facebook:  http://facebook.com/tigerby
> >> Wiki: http://tigerby.com
> >>
>
>
