kafka-users mailing list archives

From Ashutosh singh <getas...@gmail.com>
Subject Re: Partition Reassignment is getting stuck
Date Fri, 15 Nov 2019 05:47:41 GMT
Just restarting the broker didn't help. I deleted a couple of random
under-replicated partitions from the data directory. I also noticed that
their timestamps were 4 days old. After deleting them and restarting the
broker, all of the other topics got synced up.
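
For anyone hitting the same thing, here is a minimal sketch of how the affected partition directories could be identified before deleting anything. It assumes the text printed by `kafka-topics.sh --describe` has lines of the form `Topic: <name>  Partition: <n>  Leader: <id>  Replicas: <ids>  Isr: <ids>`; the exact layout can differ between Kafka versions, so treat the regular expression as an assumption. The function name `missing_replicas` is hypothetical and not part of any Kafka tooling.

```python
import re

# Assumed shape of one partition line from `kafka-topics.sh --describe`:
#   "Topic: orders  Partition: 0  Leader: 1  Replicas: 1,2,5  Isr: 1,2"
LINE_RE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>-?\d+|none)\s+Replicas:\s*(?P<replicas>[\d,]+)\s+"
    r"Isr:\s*(?P<isr>[\d,]*)"
)


def missing_replicas(describe_output):
    """Map "<topic>-<partition>" to the broker ids assigned but not in the ISR.

    The key matches the partition's directory name inside a broker's log dir.
    """
    result = {}
    for line in describe_output.splitlines():
        match = LINE_RE.search(line)
        if match is None:
            continue  # topic summary lines and blank lines are skipped
        replicas = [int(b) for b in match.group("replicas").split(",") if b]
        isr = {int(b) for b in match.group("isr").split(",") if b}
        missing = [b for b in replicas if b not in isr]
        if missing:
            key = f"{match.group('topic')}-{match.group('partition')}"
            result[key] = missing
    return result
```

If every entry in the result lists the same broker id, that points at a single lagging broker, which is the situation described in this thread.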

Maybe it was a case of an offline directory. I will check the
offlineLogDirectoryCount metric and probably set up monitoring on it.
Thank you all.
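
A rough sketch of what such a check could look like, assuming the output of `kafka-log-dirs.sh --describe` contains one JSON line with a `brokers` list, where each broker has `logDirs` entries carrying `logDir` and `error` fields (`error` being null for a healthy directory). That layout is an assumption and should be checked against the tool's output on your version. The function name `offline_log_dirs` is hypothetical. The broker-side metric itself is normally read over JMX, where it is usually exposed as `kafka.log:type=LogManager,name=OfflineLogDirectoryCount`.

```python
import json


def offline_log_dirs(tool_output):
    """Return (broker_id, log_dir, error) for every log dir reporting an error.

    `tool_output` is the full text printed by `kafka-log-dirs.sh --describe`.
    The tool prints status lines before the JSON, so only the first line
    that starts with "{" is parsed.
    """
    json_line = next(
        (ln for ln in tool_output.splitlines() if ln.lstrip().startswith("{")),
        None,
    )
    if json_line is None:
        return []
    report = json.loads(json_line)
    offline = []
    for broker in report.get("brokers", []):
        for log_dir in broker.get("logDirs", []):
            if log_dir.get("error") is not None:
                offline.append(
                    (broker["broker"], log_dir["logDir"], log_dir["error"])
                )
    return offline
```

An alert could then fire whenever the returned list is non-empty, or whenever the JMX metric is greater than zero.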


Thanks
Ashu



On Thu, Nov 14, 2019 at 3:07 AM Liam Clarke <liam.clarke@adscale.co.nz>
wrote:

> If only one broker isn't in sync, it can be caused by a dead replica fetcher
> thread, in my experience. I fixed it by restarting the affected broker, but
> this was on 0.11, so YMMV.
>
>
>
> On Thu, Nov 14, 2019 at 9:35 AM Koushik Chitta
> <kchitta@microsoft.com.invalid> wrote:
>
> > The topic partition having the ISR issue might be on an offline directory.
> > Look into the metric "offlineLogDirectoryCount" or use kafka-log-dirs.sh
> > to understand the issue with that directory. In most cases, it would be
> > a KafkaStorageException.
> > The partition reassignment would also be stuck/waiting because of this
> > when the reassignment JSON contains an offline directory.
> >
> >
> > -----Original Message-----
> > From: M. Manna <manmedia@gmail.com>
> > Sent: Wednesday, November 13, 2019 5:23 AM
> > To: Kafka Users <users@kafka.apache.org>
> > Subject: Re: Partition Reassignment is getting stuck
> >
> > On Wed, 13 Nov 2019 at 13:10, Ashutosh singh <getashu1@gmail.com> wrote:
> >
> > > Yeah, although it shouldn't have any impact, I will have to try this
> > > tonight as it is peak business hours now.
> > > Instead of deleting all data I will try to delete the topic partitions
> > > which are having issues and then restart the broker. I believe it should
> > > catch up but I will let you know.
> > >
> >
> > Since you're doing it out of business hours, it should be fine. The issue
> > you're mentioning here is not uncommon, but such occurrences should be
> > minimal. As long as you have >=3 replicas you should be able to do this
> > comfortably.
> >
> > Thanks,
> >
> > >
> > >
> > >
> > > On Wed, Nov 13, 2019 at 6:23 PM M. Manna <manmedia@gmail.com> wrote:
> > >
> > > > On Wed, 13 Nov 2019 at 12:41, Ashutosh singh <getashu1@gmail.com> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > > All of a sudden I see an under-replicated partition in our Kafka
> > > > > > cluster and it is not getting replicated. It seems it is getting
> > > > > > stuck somewhere. The in-sync replica is missing from only one of
> > > > > > the brokers, so it seems there is some issue with that broker, but
> > > > > > on the other hand there are many other topics on that node and
> > > > > > they are working fine. I have tried a rolling restart of all the
> > > > > > nodes in the cluster but that didn't help.
> > > > > > I tried manual reassignment of that particular topic but that is
> > > > > > getting stuck forever. So I had to kill the reassignment by
> > > > > > deleting the /admin/reassign_partitions node. I restarted
> > > > > > ZooKeeper so that the leader changes and then tried to reassign
> > > > > > partitions, but again it is getting stuck.
> > > > >
> > > > > > I would really appreciate it if someone could help me understand the issue.
> > > > >
> > > >
> > > > If all you have is 1 broker not in sync - can you please try to stop
> > > > that broker, delete all the data files on that broker, and restart?
> > > > It should catch up.
> > > >
> > > >
> > > > >
> > > > > No of nodes : 8
> > > > > Version : 2.1.1
> > > > >
> > > > > --
> > > > > Thanks
> > > > > Ashu
> > > > >
> > > >
> > >
> > >
> > > --
> > > Thanx & Regard
> > > Ashutosh Singh
> > > 08151945559
> > >
> >
>


-- 
Thanx & Regard
Ashutosh Singh
08151945559
