kafka-users mailing list archives

From Ewen Cheslack-Postava <e...@confluent.io>
Subject Re: Query on MirrorMaker Replication - Bi-directional/Failover replication
Date Tue, 24 Jan 2017 04:25:07 GMT
On Wed, Jan 18, 2017 at 4:56 PM, Greenhorn Techie <greenhorntechie@gmail.com> wrote:

> Hi there,
>
> Can anyone please answer my follow-up questions below regarding Ewen's
> responses.
>
> Thanks
>
>
> On Tue, 17 Jan 2017 at 00:28 Greenhorn Techie <greenhorntechie@gmail.com>
> wrote:
>
>> Thanks Ewen for the detailed response. This is quite helpful and cleared
>> some of my doubts. However, I do have some follow-up queries. Can you
>> please let me know your thoughts on the same?
>>
>> [Query] Are non-compacted topics a prerequisite for this mechanism to
>> work as expected? What challenges need to be looked out for in the case
>> of compacted topics?
>>
>
There are additional considerations when you are using compacted topics.
The core of the problem is that offsets will differ between the two sides.
That can be true of non-compacted topics as well, but at least those have a
consistent offset delta you can rely on (unless you lag too far behind and
lose data). In a compacted topic, the difference between offsets in the
source and sink could change frequently depending on a number of factors,
including lag, segment size, etc.
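If you want to see this concretely, here's a rough, untested sketch with the
Java clients (the broker address and topic name are made up): read a compacted
partition from the beginning and watch for the offset gaps compaction leaves
behind.

import java.util.*;
import org.apache.kafka.clients.consumer.*;
import org.apache.kafka.common.TopicPartition;

public class OffsetGapProbe {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "site1-kafka:9092"); // hypothetical address
        props.put("enable.auto.commit", "false");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            TopicPartition tp = new TopicPartition("compacted-topic", 0); // hypothetical topic
            consumer.assign(Collections.singletonList(tp));
            consumer.seekToBeginning(Collections.singletonList(tp));
            long expected = -1;
            for (ConsumerRecord<byte[], byte[]> rec : consumer.poll(5000)) {
                if (expected >= 0 && rec.offset() != expected) {
                    // Compaction removed the records in between, so a single
                    // fixed source->sink offset delta cannot exist.
                    System.out.printf("gap: expected offset %d, got %d%n", expected, rec.offset());
                }
                expected = rec.offset() + 1;
            }
        }
    }
}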


> 1. Use MM normally to replicate your data. Be *very* sure you construct
>> your setup to ensure *everything* is mirrored (proper # of partitions,
>> replication factor, topic-level configs, etc.). (Note that this is a
>> significant gap in MM that the Confluent replication solution is
>> addressing.)
>>
> [Query] Here we are planning to use MirrorMaker to do the job for us, and
>> hence topics are expected to be created by MirrorMaker (by setting
>> auto.create.topics.enable=true). Will this work? Or will setting
>> auto.create.topics.enable=true create the topics with default settings?
>>
>
MirrorMaker doesn't do this today. This is one of the features Confluent's
Replicator provides, including mirroring all topic-level configs. If you
use MM, you'll need to handle that yourself, either by not allowing MM to
replicate data on a topic until you know the destination topic has been
created, or by using one of MM's pluggable components to do so.
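For reference, here's a rough, untested sketch of pre-creating the destination
topic yourself with the Java AdminClient (that API was added in 0.11, so it's
an assumption that you're on new enough clients; the broker address, topic
name, and config values are made up and would be copied from the source topic):

import java.util.*;
import org.apache.kafka.clients.admin.*;

public class PreCreateMirrorTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "site2-kafka:9092"); // hypothetical DR cluster
        try (AdminClient admin = AdminClient.create(props)) {
            // Partition count, replication factor, and configs looked up
            // from the source topic out of band.
            NewTopic topic = new NewTopic("orders", 12, (short) 3); // hypothetical topic
            Map<String, String> configs = new HashMap<>();
            configs.put("retention.ms", "604800000");
            configs.put("cleanup.policy", "delete");
            topic.configs(configs);
            admin.createTopics(Collections.singletonList(topic)).all().get();
        }
    }
}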


> 2. During replication, be sure to record offset deltas for every
>> topic partition. These are needed to reverse the direction of
>> replication correctly. Make sure to store them in the backup DC and
>> somewhere very reliable.
>> [Query] Is there any recommended approach to do this? As I am new to
>> Kafka, I am wondering if there is a good way of doing this.
>>
>
There's no single recommended way. If replication is paused, you can just
record the high watermark on both sides.
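A rough, untested sketch of what that could look like with the Java clients
(endOffsets needs 0.10.1+; the broker addresses and topic name are made up,
and persisting the deltas somewhere durable is left out):

import java.util.*;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class RecordOffsetDeltas {
    static Map<TopicPartition, Long> endOffsets(String bootstrap, List<TopicPartition> parts) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrap);
        props.put("enable.auto.commit", "false");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        try (KafkaConsumer<byte[], byte[]> c = new KafkaConsumer<>(props)) {
            return c.endOffsets(parts); // high watermarks, 0.10.1+ clients
        }
    }

    public static void main(String[] args) {
        List<TopicPartition> parts = Collections.singletonList(new TopicPartition("orders", 0));
        Map<TopicPartition, Long> source = endOffsets("site1-kafka:9092", parts); // hypothetical
        Map<TopicPartition, Long> sink = endOffsets("site2-kafka:9092", parts);   // hypothetical
        for (TopicPartition tp : parts) {
            long delta = source.get(tp) - sink.get(tp);
            // Persist this per partition, e.g. to the backup DC plus somewhere very reliable.
            System.out.printf("%s delta=%d%n", tp, delta);
        }
    }
}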


> 4. Decide to do failover. Ensure replication has actually stopped (via
>> your own tooling, or probably better, by using ACLs to ensure no new data
>> can be
>> produced from original DC to backup DC)
>> [Query] Does stopping replication mean killing the MirrorMaker process?
>> Or is there more needed here? Using ACLs, we can probably ensure the
>> MirrorMaker service account doesn't have read access on the source cluster
>> or write access on the DR cluster. Is there anything else to be done here?
>>
>
At its core, yes, it means stopping MM. But as you say, you can get fancier
and enforce this with, e.g., ACLs.
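For illustration only, a rough, untested sketch using the Java AdminClient's
ACL API (added well after 0.10.x, so treat the API version as an assumption;
the principal and broker address are made up). This takes the deny-rule
variant rather than removing existing grants:

import java.util.*;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.common.acl.*;
import org.apache.kafka.common.resource.*;

public class FenceMirrorMaker {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "site1-kafka:9092"); // hypothetical old primary
        try (AdminClient admin = AdminClient.create(props)) {
            // Deny the MM service account READ on all topics, so replication
            // out of the old primary stays stopped even if a stray MM instance starts.
            AclBinding denyRead = new AclBinding(
                new ResourcePattern(ResourceType.TOPIC, "*", PatternType.LITERAL),
                new AccessControlEntry("User:mirrormaker", "*",
                        AclOperation.READ, AclPermissionType.DENY));
            admin.createAcls(Collections.singletonList(denyRead)).all().get();
        }
    }
}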


> 5. Record all the high watermarks for every topic partition so you
>> know which data was replicated from the original DC (vs which is new
>> after failover).
>> [Query] Is there any best practice around this? In the presentation, Jun
>> Rao talks about timestamp-based offset recording. As I understand it, that
>> would probably help in our case, where we can produce messages to the DR
>> cluster from the point of failover.
>>
>
Note that the timestamp-based offset lookup doesn't give you the perfect
reversal of mirroring that you are looking for. The granularity of timestamps
can't possibly guarantee that (let alone timestamp ordering issues).

Timestamp-based reset in the case of a failover is a good option today
assuming you are on 0.10.1+ brokers.
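As a rough, untested sketch of that reset with the 0.10.1+ Java consumer
(offsetsForTimes, from KIP-79; the broker address, group, topic, and timestamp
are all made up):

import java.util.*;
import org.apache.kafka.clients.consumer.*;
import org.apache.kafka.common.TopicPartition;

public class SeekToFailoverTime {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "site2-kafka:9092"); // hypothetical DR cluster
        props.put("group.id", "my-app"); // hypothetical consumer group
        props.put("enable.auto.commit", "false");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        long failoverTs = 1484870400000L; // hypothetical failover time (epoch millis)
        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            List<TopicPartition> parts = Collections.singletonList(new TopicPartition("orders", 0));
            consumer.assign(parts);
            Map<TopicPartition, Long> query = new HashMap<>();
            for (TopicPartition tp : parts) query.put(tp, failoverTs);
            // First offset with a timestamp >= failoverTs, per partition.
            Map<TopicPartition, OffsetAndTimestamp> found = consumer.offsetsForTimes(query);
            for (Map.Entry<TopicPartition, OffsetAndTimestamp> e : found.entrySet()) {
                if (e.getValue() != null) consumer.seek(e.getKey(), e.getValue().offset());
            }
            // poll() now resumes from here; because of timestamp granularity
            // and ordering, expect some duplicates rather than exactly-once.
        }
    }
}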


> 7. Once the original DC is back alive, you want to reverse replication
>> and make it the backup. Look up the offset deltas and use them to
>> initialize offsets for the consumer group you'll use to do replication.
>> [Query] In order to look up the offset deltas before initiating the
>> consumers on the original cluster, is there any recommended
>> mechanism/tooling that can be leveraged?
>>
>
There isn't tooling for this, and the intent in this step is to leverage
the deltas you recorded in an earlier step. You'd probably want to write
one tool that handles both of those steps since the output of one step is
the input to the other.
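As a rough, untested sketch of the second half of such a tool (all names and
values are made up): commit a starting offset under the consumer group MM will
run with, here using the site2 high watermark you recorded at failover. If you
start from the recorded deltas instead, translate them into a site2 offset
first.

import java.util.*;
import org.apache.kafka.clients.consumer.*;
import org.apache.kafka.common.TopicPartition;

public class SeedReplicationOffsets {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "site2-kafka:9092"); // hypothetical: new primary MM will consume from
        props.put("group.id", "mirrormaker-reverse"); // hypothetical; MM must run with this same group.id
        props.put("enable.auto.commit", "false");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            TopicPartition tp = new TopicPartition("orders", 0); // hypothetical
            long site2HighWatermarkAtFailover = 41500L; // recorded in step 5 (hypothetical value)
            consumer.assign(Collections.singletonList(tp));
            Map<TopicPartition, OffsetAndMetadata> offsets = new HashMap<>();
            offsets.put(tp, new OffsetAndMetadata(site2HighWatermarkAtFailover));
            consumer.commitSync(offsets); // MM, started later with this group.id, resumes here
        }
    }
}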

-Ewen


>
>> Best Regards
>>
>> On Fri, 6 Jan 2017 at 03:31 Ewen Cheslack-Postava <ewen@confluent.io>
>> wrote:
>>
>> On Thu, Jan 5, 2017 at 3:07 AM, Greenhorn Techie <greenhorntechie@gmail.com>
>> wrote:
>>
>> > Hi,
>> >
>> > We are planning to set up MirrorMaker-based Kafka replication for DR
>> > purposes. The base requirement is to have DR replication from the primary
>> > site (site1) to the DR site (site2) using MirrorMaker.
>> >
>> > However, we need the solution to work in the case of failover as well,
>> > i.e. in the event of the site1 Kafka cluster failing, the site2 Kafka
>> > cluster would be made primary. Later, when the site1 cluster eventually
>> > comes back up online, the direction of replication would be from site2
>> > to site1.
>> >
>> > But as I understand, the offsets on each of the clusters are different,
>> > so we are wondering how to design the solution given this constraint and
>> > these requirements.
>> >
>>
>> It turns out this is tricky. And once you start digging in, you'll find
>> it's way more complicated than you might originally think.
>>
>> Before going down the rabbit hole, I'd suggest taking a look at this great
>> talk by Jun Rao (one of the original authors of Kafka) about multi-DC
>> Kafka
>> setups: https://www.youtube.com/watch?v=Dvk0cwqGgws
>>
>> Additionally, I want to mention that while it is tempting to treat
>> multi-DC DR cases in a way that gives us really convenient, strongly
>> consistent, highly available behavior, because that makes it easier to
>> reason about and avoids pushing much of the burden down to applications,
>> that's not realistic or practical. And honestly, it's rarely even
>> necessary. DR cases really are DR. Usually it is possible to make some
>> tradeoffs you might not make under normal circumstances (the most
>> important one being the tradeoff between possibly seeing duplicates vs.
>> exactly-once). The tension here is often that one team is responsible for
>> maintaining the infrastructure and handling this DR failover scenario,
>> while others are responsible for the behavior of the applications. The
>> infrastructure team is responsible for figuring out the DR failover story,
>> but if they don't solve it at the infrastructure layer then they get stuck
>> having to understand all the current (and future) applications built on
>> that infrastructure.
>>
>> That said, here are the details I think you're looking for:
>>
>> The short answer right now is that doing DR failover like that is not
>> going
>> to be easy with MM. Confluent is building additional tools to deal with
>> multi-DC setups because of a bunch of these challenges:
>> https://www.confluent.io/product/multi-datacenter/
>>
>> For your specific concern about reversing the direction of replication,
>> you'd need to build additional tooling to support this. The basic list of
>> steps would be something like this (assuming non-compacted topics):
>>
>> 1. Use MM normally to replicate your data. Be *very* sure you construct
>> your setup to ensure *everything* is mirrored (proper # of partitions,
>> replication factor, topic-level configs, etc.). (Note that this is a
>> significant gap in MM that the Confluent replication solution is
>> addressing.)
>> 2. During replication, be sure to record offset deltas for every topic
>> partition. These are needed to reverse the direction of replication
>> correctly. Make sure to store them in the backup DC and somewhere very
>> reliable.
>> 3. Observe DC failure.
>> 4. Decide to do failover. Ensure replication has actually stopped (via
>> your
>> own tooling, or probably better, by using ACLs to ensure no new data can
>> be
>> produced from original DC to backup DC)
>> 5. Record all the high watermarks for every topic partition so you know
>> which data was replicated from the original DC (vs which is new after
>> failover).
>> 6. Allow failover to proceed. Make the backup DC primary.
>> 7. Once the original DC is back alive, you want to reverse replication and
>> make it the backup. Look up the offset deltas and use them to initialize
>> offsets for the consumer group you'll use to do replication.
>> 8. Go back to the original DC and make sure there isn't any "extra" data,
>> i.e. stuff that didn't get replicated but was successfully written to the
>> original DC's cluster. For topic partitions where there is data beyond the
>> expected offsets, you currently would need to just delete the entire set
>> of data, or at least everything back to before the offset we expect to
>> start at. (A truncate operation might be a nice way to avoid having to
>> dump *all* the data, but it doesn't currently exist.)
>> 9. Once you've got the two clusters back in a reasonably synced state with
>> appropriate starting offsets committed, start up MM again in the reverse
>> direction.
>>
>> If this sounds tricky, it turns out that when you add compacted topics,
>> things get quite a bit messier....
>>
>> -Ewen
>>
>>
>> >
>> > Thanks
>> >
>>
>>
