kafka-users mailing list archives

From João Nuno Silva <jns...@gmail.com>
Subject Re: Active-active inter-cluster synchronization
Date Mon, 06 Oct 2014 17:12:26 GMT
Hi,

After some initial brainstorming on the IRC channel (thanks @gazarsgo and
others), I will build a proof of concept (POC) of the architecture depicted
in the picture below:



To give further context, my main requirements are:
    1) the application should only interact with topics local to its DC
    2) keep the data arriving at the applications in both DCs in sync
       (consuming the client messages in order is desired but not a hard
       requirement, so syncing may lag behind)
    3) avoid batching when syncing, because my main synchronization
       requirement is latency, not throughput

Also keep in mind that it must tolerate temporary loss of connectivity
between DCs.
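
To make requirements 2 and 3 and the connectivity constraint concrete, here
is a minimal sketch. It is an in-memory model, not real Kafka client code:
the names Link, Forwarder and pump are hypothetical, and each "log" is a
plain Python list standing in for a topic. The forwarder ships one record
per send (no batching), stops while the link is down, and resumes from its
last replicated offset once the link is back.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """Hypothetical inter-DC link; set up=False to simulate an outage."""
    up: bool = True


@dataclass
class Forwarder:
    """Ships records from a local log to a remote copy, one record per send."""
    local_log: list
    remote_log: list
    link: Link
    next_offset: int = 0  # first local offset not yet replicated

    def pump(self) -> int:
        """Forward pending records in order; stop if the link is down.

        Returns the number of records forwarded by this call.
        """
        sent = 0
        while self.next_offset < len(self.local_log):
            if not self.link.up:
                break  # keep next_offset so a later pump() resumes here
            self.remote_log.append(self.local_log[self.next_offset])
            self.next_offset += 1
            sent += 1
        return sent


def demo():
    link = Link()
    dc1_local, dc2_copy = [], []
    fwd = Forwarder(dc1_local, dc2_copy, link)

    dc1_local.append("m1")
    fwd.pump()                      # link up: m1 is replicated immediately

    link.up = False                 # simulated loss of connectivity
    dc1_local.extend(["m2", "m3"])  # the local application keeps writing
    fwd.pump()                      # nothing is forwarded during the outage
    lag_during_outage = len(dc1_local) - len(dc2_copy)

    link.up = True                  # connectivity restored
    fwd.pump()                      # catches up from the stored offset
    return dc2_copy, lag_during_outage
```

In this model the local application is never blocked by the outage; only
the remote copy falls behind, which matches "syncing may lag behind" in
requirement 2.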

Please critique this architecture if you think I am overlooking any
problems. Thank you!

On Mon, Oct 6, 2014 at 4:29 PM, Neha Narkhede <neha.narkhede@gmail.com>
wrote:

> If I understood correctly, this means having 2 clusters in each data center
> (DC) one for writes and another for reads. The mirrormaker tool would then
> be set up to mirror both the write clusters of DC1 and DC2 to each of the
> read clusters. Am I right?
>
> Right, though if there are readers that need access to locally written
> data, they could read from the "write" cluster.
>
> On Mon, Oct 6, 2014 at 3:59 AM, João Nuno Silva <jnss81@gmail.com> wrote:
>
> > Hi,
> >
> > I'm evaluating Kafka for an active-active inter-cluster synchronization
> > scenario (with network partition tolerance). I've read most of what I
> > could find about this and the section that I think sums it up is
> > https://kafka.apache.org/documentation.html#datacenters
> >
> > I would like your opinion about using Kafka for this and the potential
> > pitfalls that I might encounter if I go down this road.
> >
> > I would also like to know if the approach recommended in the link I sent
> > is still the recommended one "For applications that need a global view
> > of all data you can use mirroring to provide clusters which have
> > aggregate data mirrored from the local clusters in all datacenters."
> >
> > If I understood correctly, this means having 2 clusters in each data
> > center (DC) one for writes and another for reads. The mirrormaker tool
> > would then be set up to mirror both the write clusters of DC1 and DC2 to
> > each of the read clusters. Am I right?
> >
> > Thank you for your help!
> >
>
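
The two-clusters-per-DC layout discussed in the quoted messages can be
modelled roughly as follows. This is only an illustrative sketch under
stated assumptions: the Mirror class is a hypothetical stand-in for a
mirroring process (it is not the MirrorMaker API), clusters are plain
Python lists, and record names such as "a@dc1" are made up. Each DC has a
"write" cluster that only local producers touch and a "read" cluster that
receives data mirrored from the write clusters of both DCs.

```python
class Mirror:
    """Hypothetical mirroring process: copies new records from every
    source cluster into one target cluster, tracking an offset per source."""

    def __init__(self, sources, target):
        self.sources = sources                      # dict: DC name -> log
        self.target = target                        # aggregate ("read") log
        self.offsets = {name: 0 for name in sources}

    def run_once(self):
        for name, log in self.sources.items():
            start = self.offsets[name]
            self.target.extend(log[start:])         # copy only unseen records
            self.offsets[name] = len(log)


def demo():
    write = {"dc1": [], "dc2": []}   # local clusters, written by local apps
    read = {"dc1": [], "dc2": []}    # aggregate clusters, written by mirrors

    # One mirror per read cluster, each pulling from both write clusters.
    mirrors = [Mirror(write, read[dc]) for dc in read]

    write["dc1"].append("a@dc1")
    write["dc2"].append("b@dc2")
    for m in mirrors:
        m.run_once()

    write["dc1"].append("c@dc1")
    for m in mirrors:
        m.run_once()

    return write, read
```

Because the mirrors only ever write to the read clusters, nothing is
mirrored back into a write cluster, so the model has no replication loops.
Both read clusters end up holding every record from both DCs, while the
relative order of records from different DCs depends on when each mirror
runs.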
