kafka-users mailing list archives

From "Yu, Libo " <libo...@citi.com>
Subject RE: failover strategy
Date Mon, 01 Jul 2013 13:52:02 GMT
Thanks again. It seems the 2nd method is not doable.
The downside of the first method is that if the first data
center goes down, the second one still lags behind and may
not have all the messages the first one has. We could have
the publisher publish to both data centers at the same
time, but that may degrade performance greatly.

Regards,

Libo


-----Original Message-----
From: Jun Rao [mailto:junrao@gmail.com] 
Sent: Sunday, June 30, 2013 11:48 PM
To: users@kafka.apache.org
Subject: Re: failover strategy

LinkedIn uses the first method for cross-DC mirroring. For the second method, there are 2
main issues. (1) Kafka depends on the ZK service being always available. For a ZK cluster
to be available, you need a majority of ZK servers to be up. If you set up a ZK cluster spanning
only 2 data centers, a single DC failure may make the ZK cluster unavailable. You can set
up a ZK cluster spanning 3 or more DCs, which allows you to tolerate at least 1 DC failure. (2)
Long network latency across DCs. In order for the follower to keep up with the leader in a different
DC, you need to tune parameters like replica.lag.max.messages, replica.lag.time.max.ms, and
replica.socket.receive.buffer.bytes to amortize the long network latency.

Thanks,

Jun
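
Point (1) in the reply above comes down to majority arithmetic. The following is a minimal Python sketch of that arithmetic, not something taken from the thread: the function names and the per-data-center server counts are hypothetical, and it only assumes that a ZK ensemble stays available while more than half of its servers are up.

```python
from typing import Dict


def quorum_size(n_servers: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return n_servers // 2 + 1


def survives_dc_loss(servers_per_dc: Dict[str, int], failed_dc: str) -> bool:
    """True if the ensemble still has a majority after losing one whole DC."""
    total = sum(servers_per_dc.values())
    remaining = total - servers_per_dc[failed_dc]
    return remaining >= quorum_size(total)


def tolerates_any_single_dc_failure(servers_per_dc: Dict[str, int]) -> bool:
    """True only if every possible single-DC outage leaves a majority up."""
    return all(survives_dc_loss(servers_per_dc, dc) for dc in servers_per_dc)


# Hypothetical layouts, both with 5 ZK servers in total.
two_dc = {"dc1": 3, "dc2": 2}
three_dc = {"dc1": 2, "dc2": 2, "dc3": 1}

if __name__ == "__main__":
    for name, layout in (("two_dc", two_dc), ("three_dc", three_dc)):
        print(name, tolerates_any_single_dc_failure(layout))
```

With only two data centers, one of them necessarily holds at least half of the servers, so losing that one always breaks the majority. Spreading the same five servers over three data centers avoids this in the layout shown.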


On Sat, Jun 29, 2013 at 10:50 AM, Yu, Libo <libo.yu@citi.com> wrote:

> The first method may lose messages if cluster A is permanently down or
> cannot restart right away, as B always lags behind A. Even with
> mirroring, B has to wait until A is back to get the missing messages.
> So it is not ideal. What type of solution did you use at LinkedIn?
>
> Regards,
>
> Libo
>
>
> -----Original Message-----
> From: Joel Koshy [mailto:jjkoshy.w@gmail.com]
> Sent: Friday, June 28, 2013 8:59 PM
> To: users@kafka.apache.org
> Subject: Re: failover strategy
>
> The second method (replication across DCs) is not recommended.
> The first setup would work provided the set of topics you are
> mirroring from A->B is disjoint from the set of topics you are
> mirroring from B->A (i.e., to avoid a mirroring loop).
>
> Joel
>
> On Fri, Jun 28, 2013 at 5:29 PM, Yu, Libo <libo.yu@citi.com> wrote:
> > Hi,
> >
> > I can think of two failover strategies. I am not sure which one is
> > the right way to go.
> >
> > First method. Set up Kafka server A on cluster 1 and set up another
> > server B on cluster 2. The two clusters are in different data centers.
> > Use a customized MirrorMaker to sync between the two servers. Use one
> > server in production and the other as contingency. If server A is
> > down, server B will be used (this can be transparent to
> > publishers/consumers). There may be a lag between the two servers
> > before server A goes down. But after A is back, the customized
> > MirrorMaker can sync the two, and eventually B will have all the data
> > A had before the failure.
> >
> > Second method. Set up one Kafka server using cluster 1 and cluster 2.
> > When creating a topic, always use two replicas. For each partition,
> > assign one replica to a broker in cluster 1 and the other replica to
> > a broker in cluster 2. So Kafka will handle the syncing and failover
> > for the two clusters. Is that a right (expected) way to use Kafka?
> >
> >
> > Regards,
> >
> > Libo
> >
>
