kafka-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ben Stopford <...@confluent.io>
Subject Re: Possible WAN Replication Setup
Date Sun, 17 Jan 2016 22:29:39 GMT

Don’t forget that Kafka relies on redundant replicas for fault tolerance rather than disk
persistence, so your single instances might lose messages straight out of the box if they’re
not terminated cleanly. You could set flush.messages to 1 though. Don’t forget about Zookeeper
either. That has to go somewhere. 

For what it’s worth I’ve seen one installation move away from this type of pattern as
it was a little painful to manage. Your milage may vary though. But you’re certainly not
alone with wanting to do something like this. There is a buffering producer on the roadmap,
although it may end up being a slightly different thing. 


> On 16 Jan 2016, at 00:12, Jason J. W. Williams <jasonjwwilliams@gmail.com> wrote:
> Hey Luke,
> Thank you for the reply and encouragement. I'm going to start hacking on a
> small PoC.
> -J
> On Fri, Jan 15, 2016 at 12:01 PM, Luke Steensen <
> luke.steensen@braintreepayments.com> wrote:
>> Not an expert, but that sounds like a very reasonable use case for Kafka.
>> The log.retention.* configs on the edge brokers should cover your TTL
>> needs.
>> On Thu, Jan 14, 2016 at 3:37 PM, Jason J. W. Williams <
>> jasonjwwilliams@gmail.com> wrote:
>>> Hello,
>>> We historically have been a RabbitMQ environment, but we're looking at
>>> using Kafka for a new project and I'm wondering if the following
>>> topology/setup would work well in Kafka (for RMQ we'd use federation):
>>> * Multiple remote datacenters consisting each of a single server running
>> an
>>> HTTP application that receives client data and generates events. Each
>>> server would also run single-node Kafka "cluster". The application would
>>> write events as messages into the single-node Kafka "cluster" running on
>>> the same machine.
>>> * A hub datacenter that the remote data centers are connected to via SSL.
>>> The hub data center would run a multi-node Kafka cluster (3 nodes).
>>> * Use mirrormaker in the hub data center to mirror event messages from
>> each
>>> of the remote single-node servers into the hub's central Kafka cluster,
>>> where all of the real consumers are listening.
>>> The problem set is each of the remote servers is collecting data from
>>> customers over HTTP and returning responses, but those remote servers are
>>> also generating events from those customer interactions. We want to
>> publish
>>> those events into a central hub data center for analytics. We want the
>>> event messages at the remote servers to queue up when their network
>>> connections to the hub data center is unreliable, and automatically relay
>>> queued messages to the hub data center when the network comes
>> back...making
>>> the event relay system tolerant to WAN network faults. We'd also want to
>>> set up some kind of TTL on queued messages, so if the WAN connection to
>> the
>>> hub is down for an extended period of time, the messages queued on the
>>> remote servers don't build up infinitely.
>>> Any thoughts on if this setup is advisable/inadvisable with Kafka (or any
>>> other thoughts on it) would be greatly appreciated.
>>> -J

View raw message