kafka-users mailing list archives

From Suman B N <sumannew...@gmail.com>
Subject Re: Kafka replication error
Date Thu, 06 Dec 2018 15:38:59 GMT
+users

On Thu, Dec 6, 2018 at 9:01 PM Suman B N <sumannewton@gmail.com> wrote:

> Team,
>
> We are seeing the ISR shrink and expand very frequently. In the
> follower's logs, we observe errors like the one below:
>
> [2018-12-06 20:00:42,709] WARN [ReplicaFetcherThread-2-15], Error in fetch kafka.server.ReplicaFetcherThread$FetchRequest@a0f9ba9 (kafka.server.ReplicaFetcherThread)
> java.io.IOException: Connection to 15 was disconnected before the response was read
>         at kafka.utils.NetworkClientBlockingOps$.$anonfun$blockingSendAndReceive$3(NetworkClientBlockingOps.scala:114)
>         at kafka.utils.NetworkClientBlockingOps$.$anonfun$blockingSendAndReceive$3$adapted(NetworkClientBlockingOps.scala:112)
>         at scala.Option.foreach(Option.scala:257)
>         at kafka.utils.NetworkClientBlockingOps$.$anonfun$blockingSendAndReceive$1(NetworkClientBlockingOps.scala:112)
>         at kafka.utils.NetworkClientBlockingOps$.recursivePoll$1(NetworkClientBlockingOps.scala:136)
>         at kafka.utils.NetworkClientBlockingOps$.pollContinuously$extension(NetworkClientBlockingOps.scala:142)
>         at kafka.utils.NetworkClientBlockingOps$.blockingSendAndReceive$extension(NetworkClientBlockingOps.scala:108)
>         at kafka.server.ReplicaFetcherThread.sendRequest(ReplicaFetcherThread.scala:249)
>         at kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:234)
>         at kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:42)
>         at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:118)
>         at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:103)
>         at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
>
> Can someone explain this, and help us understand how to resolve these
> under-replicated partitions?
>
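> For reference, the affected partitions can be listed with the stock
> tooling (a sketch; zk1:2181 stands in for the actual ZooKeeper connect
> string and port):
>
>   bin/kafka-topics.sh --zookeeper zk1:2181 --describe \
>       --under-replicated-partitions
>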
> server.properties file:
> broker.id=15
> port=9092
> zookeeper.connect=zk1,zk2,zk3,zk4,zk5,zk6
>
> default.replication.factor=2
> log.dirs=/data/kafka
> delete.topic.enable=true
> zookeeper.session.timeout.ms=10000
> inter.broker.protocol.version=0.10.2
> num.partitions=3
> min.insync.replicas=1
> log.retention.ms=259200000
> message.max.bytes=20971520
> replica.fetch.max.bytes=20971520
> replica.fetch.response.max.bytes=20971520
> max.partition.fetch.bytes=20971520
> fetch.max.bytes=20971520
> log.flush.interval.ms=5000
> log.roll.hours=24
> num.replica.fetchers=3
> num.io.threads=8
> num.network.threads=6
> log.message.format.version=0.9.0.1
>
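> A minimal tuning sketch, assuming the disconnects come from slow 20 MB
> fetch responses (the values below are illustrative, not a verified fix
> for this cluster):
>
>   # illustrative: allow fetch requests more time before the socket is dropped
>   replica.socket.timeout.ms=60000
>   # illustrative: let replicas lag longer before being removed from the ISR
>   replica.lag.time.max.ms=30000
>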
> Also, in what cases do we end up in this state? We have 1200-1400 topics
> and 5000-6000 partitions spread across a 20-node cluster, yet only 30-40
> partitions are under-replicated while the rest are in sync. 95% of these
> partitions have a replication factor of 2.
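>
> The shrink/expand churn itself can be watched through the broker's JMX
> metrics, e.g. (a sketch; the port depends on how JMX_PORT is set on the
> broker):
>
>   # port 9999 is an assumption (whatever JMX_PORT the broker exposes)
>   bin/kafka-run-class.sh kafka.tools.JmxTool \
>       --object-name 'kafka.server:type=ReplicaManager,name=IsrShrinksPerSec' \
>       --jmx-url service:jmx:rmi:///jndi/rmi://localhost:9999/jmxrmi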
>
> --
> *Suman*
>


-- 
*Suman*
*OlaCabs*
