kafka-users mailing list archives

From svante karlsson <s...@csi.se>
Subject Re: How to handle broker disk failure
Date Wed, 21 Jan 2015 11:11:45 GMT
Is it possible to continue to serve topics from the remaining disks while
waiting for a replacement disk, or will the broker exit or stop working? (We
would like to be able to replace disks in a relaxed manner: our datacenter is
colocated and we have no permanent staff there, since there is simply not
enough work to justify 24h staffing.)

If we trigger a rebalance during the downtime, will the under-replicated
topics/partitions be moved somewhere else? What happens then when we add
the broker again, now with a new empty disk? Will all over-replicated
partitions be removed from the reinserted broker, and finally, should/must
we trigger a rebalance?
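
To make the terms concrete, here is a minimal sketch of what "under-replicated" means in the question above: a partition whose in-sync replica set (ISR) is smaller than its assigned replica list. It is plain Python over hand-written data and does not talk to a cluster. The names PartitionState, under_replicated and partitions_on_broker, and the topics and broker ids in the example, are hypothetical and only for illustration.

```python
from typing import Dict, List, NamedTuple, Tuple

TopicPartition = Tuple[str, int]  # (topic name, partition number)


class PartitionState(NamedTuple):
    """Hypothetical snapshot of one partition's replica assignment."""
    replicas: List[int]  # broker ids assigned to hold this partition
    isr: List[int]       # broker ids currently in sync with the leader


def under_replicated(state: Dict[TopicPartition, PartitionState]) -> List[TopicPartition]:
    """Return partitions whose ISR is smaller than the assigned replica list."""
    return sorted(
        tp for tp, st in state.items()
        if len(set(st.isr)) < len(set(st.replicas))
    )


def partitions_on_broker(state: Dict[TopicPartition, PartitionState],
                         broker_id: int) -> List[TopicPartition]:
    """Return partitions that have a replica assigned to the given broker."""
    return sorted(tp for tp, st in state.items() if broker_id in st.replicas)


# Made-up cluster state: broker 3 is down, so it has dropped out of every ISR.
example = {
    ("orders", 0): PartitionState(replicas=[1, 2, 4], isr=[1, 2, 4]),
    ("orders", 1): PartitionState(replicas=[2, 3, 4], isr=[2, 4]),
    ("clicks", 0): PartitionState(replicas=[3, 4, 1], isr=[4, 1]),
}

if __name__ == "__main__":
    print(under_replicated(example))
    print(partitions_on_broker(example, 3))
```

In this made-up state, the partitions assigned to broker 3 are exactly the under-replicated ones, which is the set that would have to be re-copied once the broker comes back with an empty disk.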


2015-01-21 2:56 GMT+01:00 Jun Rao <jun@confluent.io>:

> Actually, you don't need to reassign partitions in this case. You just need
> to replace the bad disk and restart the broker. It will copy the missing
> data over automatically.
> Thanks,
> Jun
> On Tue, Jan 20, 2015 at 1:02 AM, svante karlsson <saka@csi.se> wrote:
> > I'm trying to figure out the best way to handle a disk failure in a live
> > environment.
> >
> > The obvious (and naive) solution is to decommission the broker and let
> > other brokers take over and create new followers. Then replace the disk
> > and clean the remaining log directories and add the broker again.
> >
> > The disadvantage with this approach is of course the network overhead and
> > the time it takes to reassign partitions.
> >
> > Is there a better way?
> >
> > As a sub-question, is it possible to continue running a broker with a
> > failed drive and still serve the remaining partitions?
> >
> > thanks,
> > svante
> >
