kafka-users mailing list archives

From Neha Narkhede <neha.narkh...@gmail.com>
Subject Re: fidelity of offsets when mirroring
Date Wed, 05 Mar 2014 17:30:56 GMT
Jun's suggested design is the closest you can get to surviving an AZ failure
with mirroring. However, one thing I'd like to point out about the
getOffsetsBefore API is that it gives you only an approximate offset
for a particular time t. For example, if you ask for the offset of a message
produced at time t, it may give you the offset of a message that was
produced at an earlier time (t - t'). The only guarantee it provides is that
the offset returned will be for a message that was produced *before* time t.

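What this means for mirroring is that during a failover you can get
duplicate messages: consumers restart in B from an offset at or before the
point they had reached in A, so anything in between is consumed again.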
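
To make the duplicate window concrete, here is a small Python sketch. It is
a toy model, not Kafka code: offset_before() and duplicates_on_failover()
are hypothetical helpers, and the model assumes the time-based lookup can
only resolve to a few coarse (timestamp, offset) boundaries in B's log and
that the consumer in A had already processed every message produced before
the failure time.

```python
from bisect import bisect_left

def offset_before(boundaries, t):
    """Toy stand-in for a time-based offset lookup (not the real Kafka API).

    boundaries: sorted list of (timestamp, offset) pairs the lookup can return.
    Returns the offset of the latest boundary produced strictly before t,
    or None if no boundary is older than t.
    """
    times = [ts for ts, _ in boundaries]
    i = bisect_left(times, t)  # count of boundaries with timestamp < t
    return boundaries[i - 1][1] if i > 0 else None

def duplicates_on_failover(mirror_log, boundaries, t_fail):
    """Offsets in B that get re-consumed after failing over at t_fail.

    Assumes the consumer in A had processed everything produced before t_fail.
    """
    start = offset_before(boundaries, t_fail)
    if start is None:
        start = mirror_log[0][1]  # fall back to the earliest offset in B
    return [off for ts, off in mirror_log if off >= start and ts < t_fail]

# Hypothetical mirror log in B: offsets 0..9 produced at times 100, 110, ..., 190.
mirror_log = [(100 + 10 * i, i) for i in range(10)]
# Assumed lookup granularity: only offsets 0 and 5 can be returned.
boundaries = [(100, 0), (150, 5)]

start = offset_before(boundaries, 175)
dups = duplicates_on_failover(mirror_log, boundaries, 175)
print(start, dups)
```

The coarser the boundaries are relative to the failure time, the more
messages get replayed, so consumers that fail over this way need to
tolerate reprocessing.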


On Tue, Mar 4, 2014 at 8:42 PM, Jun Rao <junrao@gmail.com> wrote:

> Currently, message offsets are not preserved by mirror maker.
> You can potentially do the failover based on the failover time. Suppose
> that the consumption in A failed at time t. You find the offset before time
> t using our getOffsetsBefore API to get the starting offset in B. Then, you
> have to manually import these offsets into ZK and then start the consumer.
> Thanks,
> Jun
> On Tue, Mar 4, 2014 at 3:23 PM, Seth White <seth.white@salesforce.com> wrote:
> > Hi,
> >
> > I have a question about mirroring. I would like to create a highly
> > available Kafka service that runs on AWS and can survive an AZ failure.
> > Based on what I've read, I plan to create a Kafka cluster in each AZ and
> > use mirror maker to replicate one cluster to the other. I'll call the two
> > clusters in their respective availability zones A and B. A is the primary,
> > which is replicated to B. Normally, all consumers consume from A and
> > record their current offset in a persistent store that is replicated
> > across A and B (like Dynamo). If I detect that A has failed, producers
> > and consumers will fail over to B. That's the basic idea.
> >
> > Now, the question: Can I rely on the offset that is being stored in the
> > persistent store to refer to the same event in each cluster? Or is it
> > possible for the two to get out of sync over time - I don't know why,
> > failures of some kind maybe - in which case the offset from A might not
> > really be valid with respect to the replica B? If that is possible, then
> > I'm wondering what I can/should do about it to achieve a clean failover.
> > I realize that the replication may lag behind, so some events from A may
> > be lost when there is a failover. That is okay.
> >
> > I've been told that creating a single cluster that spans AZs and relying
> > on the new replication functionality in 0.8 is a bad idea, as ZooKeeper
> > isn't well behaved in that case. Hence my alternative design.
> >
> > Thanks in advance.
> > Seth
> >
