kafka-users mailing list archives

From Andrew Clarkson <andrew.clark...@rallyhealth.com>
Subject Re: Kafka Backup Strategy
Date Thu, 22 Dec 2016 06:40:14 GMT
Hi Stephane,

I say this not to be condescending in any way, but simple replication
*might* cover your needs. It covers most single-node failures that cause an
unclean shutdown, such as a disk or power failure, as long as at least one
replica of your data survives (see the configs min.insync.replicas, acks,
and log.flush.interval.*). Getting the ack'ing and replication strategy
right will likely cover a lot of the failure/recovery use cases.
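
To make the interaction between those settings concrete, here is a rough
sketch in Python. It is only a back-of-the-envelope model, not anything
Kafka ships: the class and function names are made up for illustration, and
it assumes unclean leader election is disabled and ignores
log.flush.interval.* entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DurabilitySettings:
    """Hypothetical container for the three settings discussed above."""
    replication_factor: int   # replicas per partition
    min_insync_replicas: int  # min.insync.replicas
    acks: str                 # producer acks: "0", "1", "all" (or "-1")


def acknowledged_copies(s: DurabilitySettings) -> int:
    """Minimum number of replicas guaranteed to hold a message once acked."""
    if s.min_insync_replicas > s.replication_factor:
        raise ValueError("min.insync.replicas exceeds the replication factor")
    if s.acks == "0":
        return 0  # fire and forget: no guarantee at all
    if s.acks == "1":
        return 1  # only the leader is guaranteed to have it
    if s.acks in ("all", "-1"):
        return s.min_insync_replicas  # every in-sync replica has it
    raise ValueError(f"unknown acks value: {s.acks!r}")


def losses_survivable_without_data_loss(s: DurabilitySettings) -> int:
    """Broker failures an acked message is guaranteed to survive."""
    return max(acknowledged_copies(s) - 1, 0)


def brokers_down_while_writable(s: DurabilitySettings) -> int:
    """Replicas of a partition that can be offline while produces still succeed."""
    if s.acks in ("all", "-1"):
        return s.replication_factor - s.min_insync_replicas
    return s.replication_factor - 1  # any surviving replica can lead


if __name__ == "__main__":
    strict = DurabilitySettings(replication_factor=3, min_insync_replicas=2, acks="all")
    print(losses_survivable_without_data_loss(strict))  # 1
    print(brokers_down_while_writable(strict))          # 1
```

The trade-off it shows: with a replication factor of 3, min.insync.replicas=2
and acks=all, an acknowledged message survives the loss of one broker, and the
partition stays writable with one broker down. With acks=1 the guarantee drops
to zero survivable losses.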

If you need better recovery/availability guarantees than simple
replication, the de facto mechanism is "mirroring"
<https://kafka.apache.org/documentation.html#basic_ops_mirror_maker>, using
a tool called MirrorMaker. This would cover cases where an entire cluster
is lost (like an AWS region being down) or other catastrophic failures. It
is the preferred way to do multi-data center (multi-region) replication.

Back to EBS snapshots. From what I understand, snapshotting the file system
of a running broker won't give you a full picture of what's going on,
because brokers flush the logs to disk infrequently. The snapshot can miss
recent messages and, as you mentioned, leaves the logs in a "corrupted"
state that has to be repaired on startup.

If you need a persistent record in order to rerun expired data (see the
configs log.retention.*), you might want to look at a tool like Secor
<https://github.com/pinterest/secor>. Secor will write all messages to an
S3 bucket from which you could rerun the data if you need to. Sadly, it
doesn't come with a producer to rerun the data and you would have to write
your own.
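
For what it's worth, the replay side could be sketched roughly like this in
Python. Everything here is an assumption for illustration: the function
names are made up, the archived messages are assumed to be already
downloaded and decoded to one message per line (Secor's actual output format
depends on how it is configured), and the `send` callable stands in for
whatever producer client you use.

```python
from typing import Callable, Iterable, Iterator, Optional, Tuple

# (key, value) pair; the key may be absent.
Record = Tuple[Optional[bytes], bytes]
# Hypothetical producer hook: send(topic, key, value).
Sender = Callable[[str, Optional[bytes], bytes], None]


def records_from_lines(lines: Iterable[bytes]) -> Iterator[Record]:
    """Turn newline-delimited messages into keyless records, skipping blanks."""
    for line in lines:
        value = line.rstrip(b"\r\n")
        if value:
            yield (None, value)


def replay(records: Iterable[Record], send: Sender, topic: str) -> int:
    """Send every archived record to `topic`; return how many were sent."""
    count = 0
    for key, value in records:
        send(topic, key, value)
        count += 1
    return count


if __name__ == "__main__":
    archived = [b"first\n", b"\n", b"second\n"]
    sent = []
    n = replay(
        records_from_lines(archived),
        lambda topic, key, value: sent.append((topic, key, value)),
        "replayed-topic",
    )
    print(n, sent)
```

With the kafka-python client, for example, `send` could be a small wrapper
around `KafkaProducer.send(topic, key=key, value=value)`. Note that replayed
messages get new offsets, and possibly different partitions, than the
originals.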

Let me know if that helps!

Thanks much,
Andrew Clarkson

On Wed, Dec 21, 2016 at 9:32 PM, Stephane Maarek <
stephane@simplemachines.com.au> wrote:

> Hi,
>
> I have Kafka running on EC2 in AWS.
> I would like to back up my data volumes daily in order to recover to a point
> in time in case of a disaster.
>
> One thing I’m worried about is that if I do an EBS snapshot while Kafka is
> running, it seems a Kafka broker that recovers from it will have to deal with
> corrupted logs (it goes through a repair / rebuild index process). It seems
> that Kafka properly closes the logs on shutdown.
>
> Questions:
> 1) If I take the EBS snapshots while Kafka is running, is it dangerous that
> a new instance launched from this backup has to go through a repair
> process?
> 2) The other option I see is to stop the Kafka broker, and then take my EBS
> snapshot. But I can’t do that for all brokers simultaneously as I would
> lose my cluster, so therefore if I do: stop kafka broker, take snapshot,
> start kafka, next broker same steps, I would get a clean backup, but not a
> point in time backup… is that an issue?
> 3) Are there any other backup strategies I haven’t considered?
>
> Thanks!
> Stephane
>
