kafka-users mailing list archives

From Jun Rao <jun...@gmail.com>
Subject Re: users Digest 8 Jun 2013 11:53:55 -0000 Issue 489
Date Sun, 09 Jun 2013 16:26:15 GMT
Hmm, not sure how stable that 3.4.3 version is. Could anyone comment? We
have been using 3.3.4 in the consumer and we haven't observed any ZK
related issues.

Thanks,

Jun


On Sat, Jun 8, 2013 at 4:57 PM, Evan Chan <ev@ooyala.com> wrote:

> Hi guys,
>
> 3.4.3+32-1.cdh4.1.3.p0.26~lucid-cdh4.1.3
>
> It's one of the latest versions of ZK that comes with Cloudera CDH4, I
> believe.
>
> -Evan
>
>
>
> > Which version of ZK are you using?
> >
> > Thanks,
> >
> > Jun
> >
> >
> > On Fri, Jun 7, 2013 at 12:20 PM, Evan Chan <ev@ooyala.com> wrote:
> >
> > > [ Sorry if this mail is duplicated, this is my fourth try sending this
> > > message]
> > >
> > > Hey guys,
> > >
> > > I sincerely apologize if this has been covered before, I haven't quite
> > > found a similar situation.
> > >
> > > We are using Kafka 0.7.2 in production, and we are using the ZK high
> > level
> > > Scala consumer.   However, we find the ZK consumer very unstable.  It
> > would
> > > work for one or two weeks, then suddenly it would complain about ZK
> nodes
> > > disappearing, and one consumer would die, then another, then another,
> > until
> > > our pipeline is no longer pulling any data.   There are multiple
> > > NullPointerExceptions, and other problems.    We can restart it, but it
> > > does not stay up predictably.
> > >
> > > On the other hand, I have a simple app which I wrote using the simple
> > > consumer to mirror select partitions (will blog about this later) and
> it
> > > just works flawlessly.
> > >
> > > So we are faced with a dilemma to get back on track:
> > > 1)  Use SimpleConsumer, and write our own balancing code  (but honestly
> > our
> > > boxes almost never go down, compared to the rate of ZK mishaps)
> > > 2)  Upgrade to Kafka 0.8 and hope that that resolves the issue.
> > >
> > > There seem to be so many improvements in 0.8 that that seems to be the
> > > biggest win long-term, so I am wondering if people can comment on:
> > > - has anyone tried using 0.8 in production?  Is it stable yet?
> > > - How much more stable is the ZK consumer in 0.8?
> > > - will it be possible to change the offset in the 0.8 consumer?  That
> was
> > > the other reason why we wanted to move to SimpleConsumer.
> > >
> > > thanks,
> > > Evan
> > >
> >
> > Jonathan
> >
> > On Sat, Jun 8, 2013 at 2:09 AM, Jonathan Hodges <hodgesz@gmail.com>
> wrote:
> > > Thanks so much for your replies.  This has been a great help
> > understanding
> > > Rabbit better with having very little experience with it.  I have a few
> > > follow up comments below.
> >
> > Happy to help!
> >
> > I'm afraid I don't follow your arguments below.  Rabbit contains many
> > optimisations too.  I'm told that it is possible to saturate the disk
> > i/o, and you saw the message rates I quoted in the previous email.
> > YES of course there are differences, mostly an accumulation of things.
> >  For example Rabbit spends more time doing work before it writes to
> > disk.
> >
> > You said:
> >
> > "Since Rabbit must maintain the state of the
> > consumers I imagine it’s subjected to random data access patterns on disk
> > as opposed to sequential."
> >
> > I don't follow the logic here, sorry.
> >
> > Couple of side comments:
> >
> > * In your Hadoop vs RT example, Rabbit would deliver the RT messages
> > immediately and write the rest to disk.  It can do this at high rates
> > - I shall try to get you some useful data here.
> >
> > * Bear in mind that write speed should be orthogonal to read speed.
> > Ask yourself - how would Kafka provide a read cache, and when might
> > that be useful?
> >
> > * I'll find out what data structure Rabbit uses for long term
> persistence.
> >
> >
> > "Quoting the Kafka design page (
> > http://kafka.apache.org/07/design.html) performance of sequential writes
> > on
> > a 6 7200rpm SATA RAID-5 array is about 300MB/sec but the performance of
> > random writes is only about 50k/sec—a difference of nearly 10000X."
> >
> > Depending on your use case, I'd expect 2x-10x overall throughput
> > differences, and will try to find out more info.  As I said, Rabbit
> > can saturate disk i/o.
> >
> > alexis
> >
> >
> >
> >
> > >
> > >> While you are correct the payload is a much bigger concern, managing
> the
> > >> metadata and acks centrally on the broker across multiple clients at
> > scale
> > >> is also a concern.  This would seem to be exacerbated if you have
> > > consumers
> > >> at different speeds i.e. Storm and Hadoop consuming the same topic.
> > >>
> > >> In that scenario, say storm consumes the topic messages in real-time
> and
> > >> Hadoop consumes once a day.  Let’s assume the topic consists of 100k+
> > >> messages/sec throughput so that in a given day you might have 100s GBs
> > of
> > >> data flowing through the topic.
> > >>
> > >> To allow Hadoop to consume once a day, Rabbit obviously can’t keep
> 100s
> > > GBs
> > >> in memory and will need to persist this data to its internal DB to be
> > >> retrieved later.
> > >
> > > I am not sure why you think this is a problem?
> > >
> > > For a fixed number of producers and consumers, the pubsub and delivery
> > > semantics of Rabbit and Kafka are quite similar.  Think of Rabbit as
> > > adding an in-memory cache that is used to (a) speed up read
> > > consumption, (b) obviate disk writes when possible due to all client
> > > consumers being available and consuming.
> > >
> > >
> > > Actually I think this is the main use case that sets Kafka apart from
> > > Rabbit and speaks to the poster’s ‘Arguments for Kafka over RabbitMQ’
> > > question.  As you mentioned Rabbit is a general purpose messaging
> system
> > > and along with that has a lot of features not found in Kafka.  There
> are
> > > plenty of times when Rabbit makes more sense than Kafka, but not when
> you
> > > are maintaining large message stores and require high throughput to
> disk.
> > >
> > > Persisting 100s GBs of messages to disk is a much different problem
> than
> > > managing messages in memory.  Since Rabbit must maintain the state of
> the
> > > consumers I imagine it’s subjected to random data access patterns on
> disk
> > > as opposed to sequential.  Quoting the Kafka design page (
> > > http://kafka.apache.org/07/design.html) performance of sequential
> > writes on
> > > a 6 7200rpm SATA RAID-5 array is about 300MB/sec but the performance of
> > > random writes is only about 50k/sec—a difference of nearly 10000X.
> > >
> > > > They go on to say that the persistent data structure used for messaging system
> > > > metadata is often a BTree. BTrees are the most versatile data structure
> > > available, and make it possible to support a wide variety of
> > transactional
> > > and non-transactional semantics in the messaging system. They do come
> > with
> > > a fairly high cost, though: Btree operations are O(log N). Normally
> O(log
> > > N) is considered essentially equivalent to constant time, but this is
> not
> > > true for disk operations. Disk seeks come at 10 ms a pop, and each disk
> > can
> > > do only one seek at a time so parallelism is limited. Hence even a
> > handful
> > > of disk seeks leads to very high overhead. Since storage systems mix
> very
> > > fast cached operations with actual physical disk operations, the
> observed
> > > performance of tree structures is often superlinear. Furthermore BTrees
> > > require a very sophisticated page or row locking implementation to
> avoid
> > > locking the entire tree on each operation. The implementation must pay
> a
> > > fairly high price for row-locking or else effectively serialize all
> > reads.
> > > Because of the heavy reliance on disk seeks it is not possible to
> > > effectively take advantage of the improvements in drive density, and
> one
> > is
> > > forced to use small (< 100GB) high RPM SAS drives to maintain a sane
> > ratio
> > > of data to seek capacity.
> > >
> > > Intuitively a persistent queue could be built on simple reads and
> appends
> > > > to files as is commonly the case with logging solutions. Though this
> > > > structure would not support the rich semantics of a BTree implementation,
> > > > it has the advantage that all operations are O(1) and reads do not
> > > block writes or each other. This has obvious performance advantages
> since
> > > the performance is completely decoupled from the data size--one server
> > can
> > > now take full advantage of a number of cheap, low-rotational speed 1+TB
> > > SATA drives. Though they have poor seek performance, these drives often
> > > have comparable performance for large reads and writes at 1/3 the price
> > and
> > > 3x the capacity.
> > >
> > > Having access to virtually unlimited disk space without penalty means
> > that
> > > we can provide some features not usually found in a messaging system.
> For
> > > example, in kafka, instead of deleting a message immediately after
> > > > consumption, we can retain messages for a relatively long period (say a
> > > > week).
> > >
> > > Our assumption is that the volume of messages is extremely high, indeed
> > it
> > > is some multiple of the total number of page views for the site (since
> a
> > > page view is one of the activities we process). Furthermore we assume
> > each
> > > message published is read at least once (and often multiple times),
> hence
> > > we optimize for consumption rather than production.
> > >
> > > There are two common causes of inefficiency: too many network requests,
> > and
> > > excessive byte copying.
> > >
> > > To encourage efficiency, the APIs are built around a "message set"
> > > abstraction that naturally groups messages. This allows network
> requests
> > to
> > > group messages together and amortize the overhead of the network
> > roundtrip
> > > rather than sending a single message at a time.
> > >
> > > The MessageSet implementation is itself a very thin API that wraps a
> byte
> > > array or file. Hence there is no separate serialization or
> > deserialization
> > > step required for message processing, message fields are lazily
> > > deserialized as needed (or not deserialized if not needed).
> > >
> > > The message log maintained by the broker is itself just a directory of
> > > message sets that have been written to disk. This abstraction allows a
> > > single byte format to be shared by both the broker and the consumer
> (and
> > to
> > > > some degree the producer, though producer messages are checksummed and
> > > validated before being added to the log).
> > >
> > > Maintaining this common format allows optimization of the most
> important
> > > operation: network transfer of persistent log chunks. Modern unix
> > operating
> > > systems offer a highly optimized code path for transferring data out of
> > > pagecache to a socket; in Linux this is done with the sendfile system
> > call.
> > > Java provides access to this system call with the
> FileChannel.transferTo
> > > api.
> > >
> > > To understand the impact of sendfile, it is important to understand the
> > > common data path for transfer of data from file to socket:
> > >
> > >   1. The operating system reads data from the disk into pagecache in
> > kernel
> > > space
> > >   2. The application reads the data from kernel space into a user-space
> > > buffer
> > >   3. The application writes the data back into kernel space into a
> socket
> > > buffer
> > >   4. The operating system copies the data from the socket buffer to the
> > NIC
> > > buffer where it is sent over the network
> > >
> > > > This is clearly inefficient: there are four copies and two system calls.
> > Using
> > > sendfile, this re-copying is avoided by allowing the OS to send the
> data
> > > from pagecache to the network directly. So in this optimized path, only
> > the
> > > final copy to the NIC buffer is needed.
> > >
> > > We expect a common use case to be multiple consumers on a topic. Using
> > the
> > > zero-copy optimization above, data is copied into pagecache exactly
> once
> > > and reused on each consumption instead of being stored in memory and
> > copied
> > > out to kernel space every time it is read. This allows messages to be
> > > consumed at a rate that approaches the limit of the network connection.
> > >
> > >
> > > So in the end it would seem Kafka’s specialized nature to write data
> > first
> > > really shines over Rabbit when your use case requires a very high
> > > throughput unblocking firehose with large data persistence to disk.
> >  Since
> > > this is only one use case this by no means is saying Kafka is better
> than
> > > Rabbit or vice versa.  I think it is awesome there are more options to
> > > choose from so you can pick the right tool for the job.  Thanks open
> > source!
> > >
> > > As always YMMV.
> > >
> > >
> > >
> > > On Fri, Jun 7, 2013 at 4:40 PM, Alexis Richardson <
> > > alexis.richardson@gmail.com> wrote:
> > >
> > >> Jonathan,
> > >>
> > >>
> > >> On Fri, Jun 7, 2013 at 7:03 PM, Jonathan Hodges <hodgesz@gmail.com>
> > wrote:
> > >> > Hi Alexis,
> > >> >
> > >> > I appreciate your reply and clarifications to my misconception about
> > >> > Rabbit, particularly on the copying of the message payloads per
> > consumer.
> > >>
> > >> Thank-you!
> > >>
> > >>
> > >> >  It sounds like it only copies metadata like the consumer state i.e.
> > >> > position in the topic messages.
> > >>
> > >> Basically yes.  Of course when a message is delivered to N>1
> > >> *machines*, then there will be N copies, one per machine.
> > >>
> > >> Also, for various reasons, very tiny (<60b) messages do get copied as
> > >> you'd assumed.
> > >>
> > >>
> > >> > I don’t have experience with Rabbit and
> > > >> > was basing this assumption on Google searches like the
> > following -
> > >> >
> > >>
> >
> http://ilearnstack.com/2013/04/16/introduction-to-amqp-messaging-with-rabbitmq/
> > >> .
> > >> >  It seems to indicate with topic exchanges that the messages get
> > copied
> > >> to
> > >> > a queue per consumer, but I am glad you confirmed it is just the
> > >> metadata.
> > >>
> > >> Yup.
> > >>
> > >> That's a fairly decent article but even the good stuff uses words like
> > >> "copy" without a fixed denotation.  Don't believe the internets!
> > >>
> > >>
> > >> > While you are correct the payload is a much bigger concern, managing
> > the
> > >> > metadata and acks centrally on the broker across multiple clients at
> > >> scale
> > > >> > is also a concern.  This would seem to be exacerbated if you have
> > >> consumers
> > >> > at different speeds i.e. Storm and Hadoop consuming the same topic.
> > >> >
> > >> > In that scenario, say storm consumes the topic messages in real-time
> > and
> > >> > Hadoop consumes once a day.  Let’s assume the topic consists of
> 100k+
> > >> > messages/sec throughput so that in a given day you might have 100s
> > GBs of
> > >> > data flowing through the topic.
> > >> >
> > >> > To allow Hadoop to consume once a day, Rabbit obviously can’t keep
> > 100s
> > >> GBs
> > >> > in memory and will need to persist this data to its internal DB to
> be
> > >> > retrieved later.
> > >>
> > >> I am not sure why you think this is a problem?
> > >>
> > >> For a fixed number of producers and consumers, the pubsub and delivery
> > >> semantics of Rabbit and Kafka are quite similar.  Think of Rabbit as
> > >> adding an in-memory cache that is used to (a) speed up read
> > >> consumption, (b) obviate disk writes when possible due to all client
> > >> consumers being available and consuming.
> > >>
> > >>
> > >> > I believe when large amounts of data need to be persisted
> > >> > is the scenario described in the earlier posted Kafka paper (
> > >> >
> > >>
> >
> http://research.microsoft.com/en-us/um/people/srikanth/netdb11/netdb11papers/netdb11-final12.pdf
> > >> )
> > >> > where Rabbit’s performance really starts to bog down as compared to
> > >> Kafka.
> > >>
> > >> Not sure what parts of the paper you mean?
> > >>
> > >> I read that paper when it came out.  I found it strongest when
> > >> describing Kafka's design philosophy.  I found the performance
> > >> statements made about Rabbit pretty hard to understand.  This is not
> > >> meant to be a criticism of the authors!  I have seen very few
> > >> performance papers about messaging that I would base decisions on.
> > >>
> > >>
> > > >> > This Kafka paper looks to be a few years old
> > >>
> > >> Um....  Lots can change in technology very quickly :-)
> > >>
> > >> Eg.: At the time this paper was published, Instagram had 5m users.
> > >> Six months earlier in Dec 2010, it had 1m.  Since then it grew huge
> > >> and got acquired.
> > >>
> > >>
> > >>
> > >> > so has something changed
> > >> > within the Rabbit architecture to alleviate this issue when large
> > amounts
> > >> > of data are persisted to the internal DB?
> > >>
> > >> Rabbit introduced a new internal flow control system which impacted
> > >> performance under steady load.  This may be relevant?  I couldn't say
> > >> from reading the paper.
> > >>
> > >> I don't have a good reference for this to hand, but here is a post
> > >> about external flow control that you may find amusing:
> > >>
> > >>
> >
> http://www.rabbitmq.com/blog/2012/05/11/some-queuing-theory-throughput-latency-and-bandwidth/
> > >>
> > >>
> > >> > Do the producer and consumer
> > >> > numbers look correct?  If no, maybe you can share some Rabbit
> > benchmarks
> > >> > under this scenario, because I believe it is the main area where
> Kafka
> > >> > appears to be the superior solution.
> > >>
> > >> This is from about one year ago:
> > >>
> > >>
> >
> http://www.rabbitmq.com/blog/2012/04/25/rabbitmq-performance-measurements-part-2/
> > >>
> > >> Obviously none of this uses batching, which is an easy trick for
> > >> increasing throughput.
> > >>
> > >> YMMV.
> > >>
> > >> Is this helping?
> > >>
> > >> alexis
> > >>
> > >>
> > >>
> > >> > Thanks for educating me on these matters.
> > >> >
> > >> > -Jonathan
> > >> >
> > >> >
> > >> >
> > >> > On Fri, Jun 7, 2013 at 6:54 AM, Alexis Richardson <
> > alexis@rabbitmq.com
> > >> >wrote:
> > >> >
> > >> >> Hi
> > >> >>
> > >> >> Alexis from Rabbit here.  I hope I am not intruding!
> > >> >>
> > >> >> It would be super helpful if people with questions, observations or
> > >> >> moans posted them to the rabbitmq list too :-)
> > >> >>
> > >> >> A few comments:
> > >> >>
> > >> >> * Along with ZeroMQ, I consider Kafka to be one of the interesting
> > and
> > >> >> useful messaging projects out there.  In a world of cruft, Kafka is
> > >> >> cool!
> > >> >>
> > >> >> * This is because both projects come at messaging from a specific
> > >> >> point of view that is *different* from Rabbit.  OTOH, many other
> > >> >> projects exist that replicate Rabbit features for fun, or NIH, or
> due
> > >> >> to misunderstanding the semantics (yes, our docs could be better)
> > >> >>
> > >> >> * It is striking how few people describe those differences.  In a
> > >> >> nutshell they are as follows:
> > >> >>
> > >> >> *** Kafka writes all incoming data to disk immediately, and then
> > >> >> figures out who sees what.  So it is much more like a database than
> > >> >> Rabbit, in that new consumers can appear well after the disk write
> > and
> > > >> >> still subscribe to past messages.  Instead, Rabbit tries to
> > >> >> deliver to consumers and buffers otherwise.  Persistence is
> optional
> > >> >> but robust and a feature of the buffer ("queue") not the upstream
> > >> >> machinery.  Rabbit is able to cache-on-arrival via a plugin, but
> this
> > >> >> is a design overlay and not particularly optimal.
> > >> >>
> > >> >> *** Kafka is a client server system with end to end semantics.  It
> > >> >> defines order to include processing order, and keeps state on the
> > >> >> client to do this.  Group management is via a 3rd party service
> > >> >> (Zookeeper? I forget which).  Rabbit is a server-only protocol
> based
> > >> >> system which maintains order on the server and through completely
> > >> >> language neutral protocol semantics.  This makes Rabbit perhaps
> more
> > >> >> natural as a 'messaging service' eg for integration and other
> > >> >> inter-app data transfer.
> > >> >>
> > >> >> *** Rabbit is a general purpose messaging system with extras like
> > >> >> federation.  It speaks many protocols, and has core features like
> HA,
> > >> >> transactions, management, etc.  Everything can be switched on or
> off.
> > >> >> Getting all this to work while keeping the install light and fast,
> is
> > >> >> quite fiddly.  Kafka by contrast comes from a specific set of use
> > >> >> cases, which are interesting certainly.  I am not sure if Kafka
> wants
> > >> >> to be a general purpose messaging system, but it will become a bit
> > >> >> more like Rabbit if that is the goal.
> > >> >>
> > >> >> *** Both approaches have costs.  In the case of Rabbit the cost is
> > >> >> that more metadata is stored on the broker.  Kafka can get
> > performance
> > >> >> gains by storing less such data.  But we are talking about some N
> > >> >> thousands of MPS versus some M thousands.  At those speeds the
> > clients
> > >> >> are usually the bottleneck anyway.
> > >> >>
> > >> >> * Let me also clarify some things:
> > >> >>
> > >> >> *** Rabbit does NOT store multiple copies of the same message
> across
> > >> >> queues, unless they are very small (<60b, iirc).  A message
> delivered
> > >> >> to >1 queue on 1 machine is stored once.  Metadata about that
> message
> > >> >> may be stored more than once, but, at scale, the big cost is the
> > >> >> payload.
> > >> >>
> > >> >> *** Rabbit's vanilla install does store some index data in memory
> > when
> > >> >> messages flow to disk.  You can change this by using a plugin, but
> > >> >> this is a secret-menu undocumented feature.  Very very few people
> > need
> > >> >> any such thing.
> > >> >>
> > >> >> *** A Rabbit queue is lightweight.  It's just an ordered
> consumption
> > >> >> buffer that can persist and ack.  Don't assume things about Rabbit
> > >> >> queues based on what you know about IBM MQ, JMS, and so forth.
> >  Queues
> > >> >> in Rabbit and Kafka are not the same.
> > >> >>
> > >> >> *** Rabbit does not use mnesia for message storage.  It has its own
> > >> >> DB, optimised for messaging.  You can use other DBs but this is
> > >> >> Complicated.
> > >> >>
> > >> >> *** Rabbit does all kinds of batching and bulk processing, and can
> > >> >> batch end to end.  If you see claims about batching, buffering,
> etc.,
> > >> >> find out ALL the details before drawing conclusions.
> > >> >>
> > >> >> I hope this is helpful.
> > >> >>
> > >> >> Keen to get feedback / questions / corrections.
> > >> >>
> > >> >> alexis
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >> On Fri, Jun 7, 2013 at 2:09 AM, Marc Labbe <mrlabbe@gmail.com>
> > wrote:
> > >> >> > We also went through the same decision making and our arguments
> for
> > >> Kafka
> > > >> >> > were along the same lines as those Jonathan mentioned. The fact
> that
> > we
> > >> >> have
> > >> >> > heterogeneous consumers is really a deciding factor. Our
> > requirements
> > >> >> were
> > > >> >> > to avoid losing messages at all cost while having multiple
> > consumers
> > >> >> > reading the same data at a different pace. On one side, we have a
> > few
> > >> >> > consumers being fed with data coming in from most, if not all,
> > >> topics. On
> > >> >> > the other side, we have a good bunch of consumers reading only
> > from a
> > >> >> > single topic. The big guys can take their time to read while the
> > >> smaller
> > >> >> > ones are mostly for near real-time events so they need to keep up
> > the
> > >> >> pace
> > >> >> > of incoming messages.
> > >> >> >
> > >> >> > RabbitMQ stores data on disk only if you tell it to while Kafka
> > >> persists
> > >> >> by
> > >> >> > design. From the beginning, we decided we would try to use the
> > queues
> > >> the
> > >> >> > same way, pub/sub with a routing key (an exchange in RabbitMQ) or
> > >> topic,
> > >> >> > persisted to disk and replicated.
> > >> >> >
> > > >> >> > One of our scenarios was to see how the system would cope with the
> > >> largest
> > >> >> > consumer down for a while, therefore forcing the brokers to keep
> > the
> > >> data
> > > >> >> > for a long period. In the case of RabbitMQ, this consumer has its
> > > >> >> > own queue
> > > >> >> > and data grows on disk, which is not really a problem if you plan
> > > >> >> > accordingly. But, since it has to keep track of all messages
> read,
> > >> the
> > >> >> > Mnesia database used by RabbitMQ as the messages index also grows
> > >> pretty
> > >> >> > big. At that point, the amount of RAM necessary becomes very
> large
> > to
> > >> >> keep
> > > >> >> > the level of performance we need. In our tests, we found that this had an
> > > >> >> > adverse effect on ALL the brokers, thus affecting all consumers.
> > You
> > >> can
> > >> >> > always say that you'll monitor the consumers to make sure it
> won't
> > >> >> happen.
> > >> >> > That's a good thing if you can. I wasn't ready to make that bet.
> > >> >> >
> > > >> >> > Another point is the fact that, since we wanted to use pub/sub with an
> > > >> >> > exchange in RabbitMQ, we would have ended up with a lot of data duplication
> > > >> >> > because if a message is read by multiple consumers, it will get duplicated
> > > >> >> > in the queue of each of those consumers. Kafka wins on that side
> too
> > >> since
> > >> >> > every consumer reads from the same source.
> > >> >> >
> > >> >> > The downsides of Kafka were the language issues (we are using
> > mostly
> > >> >> Python
> > >> >> > and C#). 0.8 is very new and few drivers are available at this
> > point.
> > >> >> Also,
> > >> >> > we will have to try getting as close as possible to
> > once-and-only-once
> > > >> >> > guarantee. These are two things where RabbitMQ would have given
> us
> > >> less
> > >> >> > work out of the box as opposed to Kafka. RabbitMQ also provides a
> > >> bunch
> > >> >> of
> > > >> >> > tools that make it rather attractive too.
> > >> >> >
> > >> >> > In the end, looking at throughput is a pretty nifty thing but
> being
> > >> sure
> > >> >> > that I'll be able to manage the beast as it grows will allow me
> to
> > >> get to
> > >> >> > sleep way more easily.
> > >> >> >
> > >> >> >
> > >> >> > On Thu, Jun 6, 2013 at 3:28 PM, Jonathan Hodges <
> hodgesz@gmail.com
> > >
> > >> >> wrote:
> > >> >> >
> > >> >> >> We just went through a similar exercise with RabbitMQ at our
> > company
> > >> >> with
> > >> >> >> streaming activity data from our various web properties.  Our
> use
> > >> case
> > >> >> >> requires consumption of this stream by many heterogeneous
> > consumers
> > >> >> >> including batch (Hadoop) and real-time (Storm).  We pointed out
> > that
> > >> >> Kafka
> > >> >> >> acts as a configurable rolling window of time on the activity
> > stream.
> > >> >>  The
> > >> >> >> window default is 7 days which allows for supporting clients of
> > >> >> different
> > >> >> >> latencies like Hadoop and Storm to read from the same stream.
> > >> >> >>
> > >> >> >> We pointed out that the Kafka brokers don't need to maintain
> > consumer
> > >> >> state
> > >> >> >> in the stream and only have to maintain one copy of the stream
> to
> > >> >> support N
> > >> >> >> number of consumers.  Rabbit brokers on the other hand have to
> > >> maintain
> > >> >> the
> > >> >> >> state of each consumer as well as create a copy of the stream
> for
> > >> each
> > >> >> >> consumer.  In our scenario we have 10-20 consumers and with the
> > scale
> > >> >> and
> > >> >> >> throughput of the activity stream we were able to show Rabbit
> > quickly
> > >> >> >> becomes the bottleneck under load.
> > >> >> >>
> > >> >> >>
> > >> >> >>
> > >> >> >> On Thu, Jun 6, 2013 at 12:40 PM, Dragos Manolescu <
> > >> >> >> Dragos.Manolescu@servicenow.com> wrote:
> > >> >> >>
> > >> >> >> > Hi --
> > >> >> >> >
> > >> >> >> > I am preparing to make a case for using Kafka instead of
> Rabbit
> > MQ
> > >> as
> > >> >> a
> > >> >> >> > broker-based messaging provider. The context is similar to
> that
> > of
> > >> the
> > >> >> >> > Kafka papers and user stories: the producers publish
> monitoring
> > >> data
> > >> >> and
> > >> >> >> > logs, and a suite of subscribers consume this data (some store
> > it,
> > >> >> others
> > >> >> >> > perform computations on the event stream). The requirements
> are
> > >> >> typical
> > >> >> >> of
> > >> >> >> > this context: low-latency, high-throughput, ability to deal
> with
> > >> >> bursts
> > >> >> >> and
> > >> >> >> > operate in/across multiple data centers, etc.
> > >> >> >> >
> > >> >> >> > I am familiar with the performance comparison between Kafka,
> > >> Rabbit MQ
> > >> >> >> and
> > >> >> >> > Active MQ from the NetDB 2011 paper<
> > >> >> >> >
> > >> >> >>
> > >> >>
> > >>
> >
> http://research.microsoft.com/en-us/um/people/srikanth/netdb11/netdb11papers/netdb11-final12.pdf
> > >> >> >> >.
> > >> >> >> > However in the two years that passed since then the number of
> > >> >> production
> > >> >> >> > Kafka installations increased, and people are using it in
> > different
> > >> >> ways
> > >> >> >> > than those imagined by Kafka's designers. In light of these
> > >> >> experiences
> > >> >> >> one
> > >> >> >> > can use more data points and color when contrasting to Rabbit
> MQ
> > >> >> (which
> > >> >> >> by
> > >> >> >> > the way also evolved since 2011). (And FWIW I know I am not
> the
> > >> first
> > >> >> one
> > >> >> >> > to walk this path; see for example last year's OSCON session
> on
> > the
> > >> >> State
> > >> >> >> > of MQ<http://lanyrd.com/2012/oscon/swrcz/>.)
> > >> >> >> >
> > >> >> >> > I would appreciate it if you could share measurements,
> results,
> > or
> > >> >> even
> > >> >> >> > anecdotal evidence along these lines. How have you avoided the
> > >> "let's
> > >> >> use
> > >> >> >> > Rabbit MQ because everybody else does it" route when solving
> > >> problems
> > >> >> for
> > >> >> >> > which Kafka is a better fit?
> > >> >> >> >
> > >> >> >> > Thanks,
> > >> >> >> >
> > >> >> >> > -Dragos
> > >> >> >> >
> > >> >> >>
> > >> >>
> > >>
> >
> >
> > On Sat, Jun 8, 2013 at 2:09 AM, Jonathan Hodges <hodgesz@gmail.com>
> wrote:
> > > Thanks so much for your replies.  This has been a great help
> > understanding
> > > Rabbit better with having very little experience with it.  I have a few
> > > follow up comments below.
> >
> > Happy to help!
> >
> > I'm afraid I don't follow your arguments below.  Rabbit contains many
> > optimisations too.  I'm told that it is possible to saturate the disk
> > i/o, and you saw the message rates I quoted in the previous email.
> > YES of course there are differences, mostly an accumulation of things.
> >  For example Rabbit spends more time doing work before it writes to
> > disk.
> >
> > It would be great if you could detail some of the optimizations.  It
> > would seem to me Rabbit has much more overhead due to maintaining state
> of
> > the consumers as well as general messaging processing which makes it
> > impossible to manage the same write throughput as Kafka when you need to
> > persist large amounts of data to disk.  I definitely believe you that
> > Rabbit can saturate the disk but it is much more seek centric i.e. random
> > access read/writes vs sequential read/writes.  Kafka saturates the disk
> > too, but since it leverages sequential disk I/O it is orders of magnitude
> more
> > efficient persisting to disk than random access.
> >
> >
> > You said:
> >
> > "Since Rabbit must maintain the state of the
> > consumers I imagine it’s subjected to random data access patterns on disk
> > as opposed to sequential."
> >
> > I don't follow the logic here, sorry.
> >
> > Couple of side comments:
> >
> > * In your Hadoop vs RT example, Rabbit would deliver the RT messages
> > immediately and write the rest to disk.  It can do this at high rates
> > - I shall try to get you some useful data here.
> >
> > * Bear in mind that write speed should be orthogonal to read speed.
> > Ask yourself - how would Kafka provide a read cache, and when might
> > that be useful?
> >
> > * I'll find out what data structure Rabbit uses for long term
> persistence.
> >
> > What I am saying here is when Rabbit needs to retrieve and persist each
> > consumer’s state from its internal DB this information isn’t linearly
> > > persisted on disk so it requires disk seeks, which are much less
> > > efficient than sequential access.  You do get the difference here,
> > correct?  Sequential reads from disk are nearly 1.5x faster than random
> > reads from memory and 4-5 orders of magnitude faster than random reads
> from
> > disk (http://queue.acm.org/detail.cfm?id=1563874).
> >
> > > As was detailed at length in my previous post, Kafka uses the OS
> > > pagecache/sendfile, which is much more efficient than an in-process
> > > memory or application cache.
> >
> > That would be awesome if you can confirm what Rabbit is using as a
> > persistent data structure.  More importantly, whether it is BTree or
> > something else, is the disk i/o random or linear?
> >
> >
> > "Quoting the Kafka design page (
> > http://kafka.apache.org/07/design.html) performance of sequential writes
> > on
> > a 6 7200rpm SATA RAID-5 array is about 300MB/sec but the performance of
> > random writes is only about 50k/sec—a difference of nearly 10000X."
> >
> > Depending on your use case, I'd expect 2x-10x overall throughput
> > differences, and will try to find out more info.  As I said, Rabbit
> > can saturate disk i/o.
> >
> > > I am only speaking of the use case of high throughput while persisting
> > > large amounts of data to disk, where the difference is closer to 4 orders
> > > of magnitude than 10x.  It all comes down to random vs sequential
> > > writes/reads to disk as I mentioned above.
> >
> >
> > On Sat, Jun 8, 2013 at 2:07 AM, Alexis Richardson <
> > alexis.richardson@gmail.com> wrote:
> >
> > > Jonathan
> > >
> > > On Sat, Jun 8, 2013 at 2:09 AM, Jonathan Hodges <hodgesz@gmail.com>
> > wrote:
> > > > Thanks so much for your replies.  This has been a great help
> > > understanding
> > > > Rabbit better with having very little experience with it.  I have a
> few
> > > > follow up comments below.
> > >
> > > Happy to help!
> > >
> > > I'm afraid I don't follow your arguments below.  Rabbit contains many
> > > optimisations too.  I'm told that it is possible to saturate the disk
> > > i/o, and you saw the message rates I quoted in the previous email.
> > > YES of course there are differences, mostly an accumulation of things.
> > >  For example Rabbit spends more time doing work before it writes to
> > > disk.
> > >
> > > You said:
> > >
> > > "Since Rabbit must maintain the state of the
> > > consumers I imagine it’s subjected to random data access patterns on
> disk
> > > as opposed to sequential."
> > >
> > > I don't follow the logic here, sorry.
> > >
> > > Couple of side comments:
> > >
> > > * In your Hadoop vs RT example, Rabbit would deliver the RT messages
> > > immediately and write the rest to disk.  It can do this at high rates
> > > - I shall try to get you some useful data here.
> > >
> > > * Bear in mind that write speed should be orthogonal to read speed.
> > > Ask yourself - how would Kafka provide a read cache, and when might
> > > that be useful?
> > >
> > > * I'll find out what data structure Rabbit uses for long term
> > persistence.
> > >
> > >
> > > "Quoting the Kafka design page (
> > > http://kafka.apache.org/07/design.html) performance of sequential
> writes
> > > on
> > > a 6 7200rpm SATA RAID-5 array is about 300MB/sec but the performance of
> > > random writes is only about 50k/sec—a difference of nearly 10000X."
> > >
> > > Depending on your use case, I'd expect 2x-10x overall throughput
> > > differences, and will try to find out more info.  As I said, Rabbit
> > > can saturate disk i/o.
> > >
> > > alexis
> > >
> > >
> > >
> > >
> > > >
> > > >> While you are correct the payload is a much bigger concern, managing
> > the
> > > >> metadata and acks centrally on the broker across multiple clients at
> > > scale
> > > > >> is also a concern.  This would seem to be exacerbated if you have
> > > > consumers
> > > >> at different speeds i.e. Storm and Hadoop consuming the same topic.
> > > >>
> > > >> In that scenario, say storm consumes the topic messages in real-time
> > and
> > > >> Hadoop consumes once a day.  Let’s assume the topic consists of
> 100k+
> > > >> messages/sec throughput so that in a given day you might have 100s
> GBs
> > > of
> > > >> data flowing through the topic.
> > > >>
> > > >> To allow Hadoop to consume once a day, Rabbit obviously can’t keep
> > 100s
> > > > GBs
> > > >> in memory and will need to persist this data to its internal DB to
> be
> > > >> retrieved later.
> > > >
> > > > I am not sure why you think this is a problem?
> > > >
> > > > For a fixed number of producers and consumers, the pubsub and
> delivery
> > > > semantics of Rabbit and Kafka are quite similar.  Think of Rabbit as
> > > > adding an in-memory cache that is used to (a) speed up read
> > > > consumption, (b) obviate disk writes when possible due to all client
> > > > consumers being available and consuming.
> > > >
> > > >
> > > > Actually I think this is the main use case that sets Kafka apart from
> > > > Rabbit and speaks to the poster’s ‘Arguments for Kafka over RabbitMQ’
> > > > question.  As you mentioned Rabbit is a general purpose messaging
> > system
> > > > and along with that has a lot of features not found in Kafka.  There
> > are
> > > > plenty of times when Rabbit makes more sense than Kafka, but not when
> > you
> > > > are maintaining large message stores and require high throughput to
> > disk.
> > > >
> > > > Persisting 100s GBs of messages to disk is a much different problem
> > than
> > > > managing messages in memory.  Since Rabbit must maintain the state of
> > the
> > > > consumers I imagine it’s subjected to random data access patterns on
> > disk
> > > > as opposed to sequential.  Quoting the Kafka design page (
> > > > http://kafka.apache.org/07/design.html) performance of sequential
> > > writes on
> > > > a 6 7200rpm SATA RAID-5 array is about 300MB/sec but the performance
> of
> > > > random writes is only about 50k/sec—a difference of nearly 10000X.
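A quick arithmetic check of the quoted figures, as a rough sketch. It assumes "50k/sec" means 50 KB/sec and that 1 MB = 1024 KB; neither is stated in the quoted text, and the function name is made up for illustration. Under those assumptions the ratio comes out at a little over 6000x, which is between 3 and 4 orders of magnitude.

```python
import math


def throughput_ratio(sequential_mb_per_s=300.0, random_kb_per_s=50.0):
    """Return how many times larger the sequential figure is than the random one.

    Assumes the random figure is in KB/sec and 1 MB = 1024 KB.
    """
    sequential_kb_per_s = sequential_mb_per_s * 1024  # MB/s -> KB/s
    return sequential_kb_per_s / random_kb_per_s


ratio = throughput_ratio()
orders_of_magnitude = math.log10(ratio)
print(f"sequential / random = {ratio:.0f}x (about 10^{orders_of_magnitude:.1f})")
```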
> > > >
> > > > They go on to say persistent data structure used in messaging systems
> > > > metadata is often a BTree. BTrees are the most versatile data
> structure
> > > > available, and make it possible to support a wide variety of
> > > transactional
> > > > and non-transactional semantics in the messaging system. They do come
> > > with
> > > > a fairly high cost, though: Btree operations are O(log N). Normally
> > O(log
> > > > N) is considered essentially equivalent to constant time, but this is
> > not
> > > > true for disk operations. Disk seeks come at 10 ms a pop, and each
> disk
> > > can
> > > > do only one seek at a time so parallelism is limited. Hence even a
> > > handful
> > > > of disk seeks leads to very high overhead. Since storage systems mix
> > very
> > > > fast cached operations with actual physical disk operations, the
> > observed
> > > > performance of tree structures is often superlinear. Furthermore
> BTrees
> > > > require a very sophisticated page or row locking implementation to
> > avoid
> > > > locking the entire tree on each operation. The implementation must
> pay
> > a
> > > > fairly high price for row-locking or else effectively serialize all
> > > reads.
> > > > Because of the heavy reliance on disk seeks it is not possible to
> > > > effectively take advantage of the improvements in drive density, and
> > one
> > > is
> > > > forced to use small (< 100GB) high RPM SAS drives to maintain a sane
> > > ratio
> > > > of data to seek capacity.
> > > >
> > > > Intuitively a persistent queue could be built on simple reads and
> > appends
> > > > to files as is commonly the case with logging solutions. Though this
> > > > structure would not support the rich semantics of a BTree
> > implementation,
> > > > > it has the advantage that all operations are O(1) and reads do
> not
> > > > block writes or each other. This has obvious performance advantages
> > since
> > > > the performance is completely decoupled from the data size--one
> server
> > > can
> > > > now take full advantage of a number of cheap, low-rotational speed
> 1+TB
> > > > SATA drives. Though they have poor seek performance, these drives
> often
> > > > have comparable performance for large reads and writes at 1/3 the
> price
> > > and
> > > > 3x the capacity.
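The append-only idea above can be sketched in a few lines. This is a toy illustration, not Kafka's actual log format: the class name, the 4-byte length prefix, and the helper functions are all assumptions made for the example. The point it tries to show is that appends always go to the end of the file, and each consumer keeps its own offset, so a slow consumer and a fast one can read the same file independently.

```python
import os
import struct
import tempfile


class AppendOnlyLog:
    """Toy log: each record is a 4-byte big-endian length followed by the payload."""

    def __init__(self, path):
        self.path = path
        open(path, "ab").close()  # create the file if it does not exist yet

    def append(self, payload):
        """Append one message at the end of the file and return its byte offset."""
        with open(self.path, "ab") as f:
            f.seek(0, os.SEEK_END)
            offset = f.tell()
            f.write(struct.pack(">I", len(payload)) + payload)
        return offset

    def read(self, offset):
        """Return (payload, next_offset) for the record at offset, or None at end of log."""
        with open(self.path, "rb") as f:
            f.seek(offset)
            header = f.read(4)
            if len(header) < 4:
                return None
            (length,) = struct.unpack(">I", header)
            return f.read(length), offset + 4 + length


def consume_all(log, offset=0):
    """Read every message from offset onward; the caller, not the log, owns the offset."""
    messages = []
    while True:
        record = log.read(offset)
        if record is None:
            return messages, offset
        payload, offset = record
        messages.append(payload)


def demo():
    """Two consumers at different speeds reading one shared log."""
    with tempfile.TemporaryDirectory() as directory:
        log = AppendOnlyLog(os.path.join(directory, "topic.log"))
        for message in (b"a", b"bb", b"ccc"):
            log.append(message)
        fast_first, fast_offset = consume_all(log)         # "real-time" consumer
        log.append(b"dddd")
        slow_all, _ = consume_all(log)                      # "batch" consumer, starts at 0
        fast_second, _ = consume_all(log, fast_offset)      # resumes where it left off
        return fast_first, fast_second, slow_all
```

Nothing in the log changes when a consumer reads, which is why adding a second or tenth consumer costs the writer nothing in this sketch.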
> > > >
> > > > Having access to virtually unlimited disk space without penalty means
> > > that
> > > > we can provide some features not usually found in a messaging system.
> > For
> > > > example, in kafka, instead of deleting a message immediately after
> > > > > consumption, we can retain messages for a relatively long period (say a
> > > week).
> > > >
> > > > Our assumption is that the volume of messages is extremely high,
> indeed
> > > it
> > > > is some multiple of the total number of page views for the site
> (since
> > a
> > > > page view is one of the activities we process). Furthermore we assume
> > > each
> > > > message published is read at least once (and often multiple times),
> > hence
> > > > we optimize for consumption rather than production.
> > > >
> > > > There are two common causes of inefficiency: too many network
> requests,
> > > and
> > > > excessive byte copying.
> > > >
> > > > To encourage efficiency, the APIs are built around a "message set"
> > > > abstraction that naturally groups messages. This allows network
> > requests
> > > to
> > > > group messages together and amortize the overhead of the network
> > > roundtrip
> > > > rather than sending a single message at a time.
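To make the amortization argument concrete, here is a small sketch with a made-up cost model. The function names and the cost figures (1000 microseconds per round trip, 10 microseconds per message) are assumptions chosen for illustration only, not measurements of Kafka or any other broker.

```python
def batch_messages(messages, max_batch_size):
    """Group messages into lists of at most max_batch_size items, preserving order."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    return [
        messages[i:i + max_batch_size]
        for i in range(0, len(messages), max_batch_size)
    ]


def total_send_time_us(num_messages, batch_size, round_trip_us=1000, per_message_us=10):
    """Toy cost model: every request pays one round trip, every message a fixed cost."""
    requests = -(-num_messages // batch_size)  # ceiling division
    return requests * round_trip_us + num_messages * per_message_us


messages = [f"event-{i}" for i in range(10)]
print(batch_messages(messages, 4))        # three requests instead of ten
print(total_send_time_us(10_000, 1))      # one message per request
print(total_send_time_us(10_000, 100))    # one hundred messages per request
```

In this model the per-message cost is the same either way; only the number of round trips shrinks as the batch grows.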
> > > >
> > > > The MessageSet implementation is itself a very thin API that wraps a
> > byte
> > > > array or file. Hence there is no separate serialization or
> > > deserialization
> > > > step required for message processing, message fields are lazily
> > > > deserialized as needed (or not deserialized if not needed).
> > > >
> > > > The message log maintained by the broker is itself just a directory
> of
> > > > message sets that have been written to disk. This abstraction allows
> a
> > > > single byte format to be shared by both the broker and the consumer
> > (and
> > > to
> > > > > some degree the producer, though producer messages are checksummed and
> > > > validated before being added to the log).
> > > >
> > > > Maintaining this common format allows optimization of the most
> > important
> > > > operation: network transfer of persistent log chunks. Modern unix
> > > operating
> > > > systems offer a highly optimized code path for transferring data out
> of
> > > > pagecache to a socket; in Linux this is done with the sendfile system
> > > call.
> > > > Java provides access to this system call with the
> > FileChannel.transferTo
> > > > api.
> > > >
> > > > To understand the impact of sendfile, it is important to understand
> the
> > > > common data path for transfer of data from file to socket:
> > > >
> > > >   1. The operating system reads data from the disk into pagecache in
> > > kernel
> > > > space
> > > >   2. The application reads the data from kernel space into a
> user-space
> > > > buffer
> > > >   3. The application writes the data back into kernel space into a
> > socket
> > > > buffer
> > > >   4. The operating system copies the data from the socket buffer to
> the
> > > NIC
> > > > buffer where it is sent over the network
> > > >
> > > > > This is clearly inefficient: there are four copies and two system calls.
> > > Using
> > > > sendfile, this re-copying is avoided by allowing the OS to send the
> > data
> > > > from pagecache to the network directly. So in this optimized path,
> only
> > > the
> > > > final copy to the NIC buffer is needed.
> > > >
> > > > We expect a common use case to be multiple consumers on a topic.
> Using
> > > the
> > > > zero-copy optimization above, data is copied into pagecache exactly
> > once
> > > > and reused on each consumption instead of being stored in memory and
> > > copied
> > > > out to kernel space every time it is read. This allows messages to be
> > > > consumed at a rate that approaches the limit of the network
> connection.
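The two data paths described above can be put side by side in a short sketch. This is not Kafka's code (the text says Kafka uses Java's FileChannel.transferTo); it uses Python's standard `socket.sendfile`, which calls `os.sendfile` where the platform supports it and falls back to plain `send` otherwise. The function names and the payload are assumptions made for the example.

```python
import os
import socket
import tempfile


def send_with_user_space_copy(path, sock, chunk_size=64 * 1024):
    """Classic path: read into a user-space buffer, then write it to the socket."""
    sent = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)   # kernel pagecache -> user-space buffer
            if not chunk:
                break
            sock.sendall(chunk)          # user-space buffer -> kernel socket buffer
            sent += len(chunk)
    return sent


def send_zero_copy(path, sock):
    """sendfile path: ask the kernel to move file data to the socket directly."""
    with open(path, "rb") as f:
        return sock.sendfile(f)          # returns the number of bytes sent


def demo(payload=b"kafka-message-set " * 512):
    """Send the same small file both ways over a local socket pair."""
    results = []
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name
    try:
        for sender in (send_with_user_space_copy, send_zero_copy):
            left, right = socket.socketpair()
            with left, right:
                sent = sender(path, left)
                left.shutdown(socket.SHUT_WR)   # signal end of stream to the reader
                received = bytearray()
                while True:
                    data = right.recv(65536)
                    if not data:
                        break
                    received.extend(data)
            results.append((sent, bytes(received)))
    finally:
        os.unlink(path)
    return results
```

Both functions deliver identical bytes; the difference is only in how many times the data crosses the kernel/user boundary on the way out. The payload is kept small so it fits in the socket buffer without a concurrent reader.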
> > > >
> > > >
> > > > So in the end it would seem Kafka’s specialized nature to write data
> > > first
> > > > really shines over Rabbit when your use case requires a very high
> > > > > throughput, non-blocking firehose with large data persistence to disk.
> > >  Since
> > > > this is only one use case this by no means is saying Kafka is better
> > than
> > > > Rabbit or vice versa.  I think it is awesome there are more options
> to
> > > > choose from so you can pick the right tool for the job.  Thanks open
> > > source!
> > > >
> > > > As always YMMV.
> > > >
> > > >
> > > >
> > > > On Fri, Jun 7, 2013 at 4:40 PM, Alexis Richardson <
> > > > alexis.richardson@gmail.com> wrote:
> > > >
> > > >> Jonathan,
> > > >>
> > > >>
> > > >> On Fri, Jun 7, 2013 at 7:03 PM, Jonathan Hodges <hodgesz@gmail.com>
> > > wrote:
> > > >> > Hi Alexis,
> > > >> >
> > > >> > I appreciate your reply and clarifications to my misconception
> about
> > > >> > Rabbit, particularly on the copying of the message payloads per
> > > consumer.
> > > >>
> > > >> Thank-you!
> > > >>
> > > >>
> > > >> >  It sounds like it only copies metadata like the consumer state
> i.e.
> > > >> > position in the topic messages.
> > > >>
> > > >> Basically yes.  Of course when a message is delivered to N>1
> > > >> *machines*, then there will be N copies, one per machine.
> > > >>
> > > >> Also, for various reasons, very tiny (<60b) messages do get copied
> as
> > > >> you'd assumed.
> > > >>
> > > >>
> > > >> > I don’t have experience with Rabbit and
> > > > >> > was basing this assumption on Google searches like the
> > > following -
> > > >> >
> > > >>
> > >
> >
> http://ilearnstack.com/2013/04/16/introduction-to-amqp-messaging-with-rabbitmq/
> > > >> .
> > > >> >  It seems to indicate with topic exchanges that the messages get
> > > copied
> > > >> to
> > > >> > a queue per consumer, but I am glad you confirmed it is just the
> > > >> metadata.
> > > >>
> > > >> Yup.
> > > >>
> > > >> That's a fairly decent article but even the good stuff uses words
> like
> > > >> "copy" without a fixed denotation.  Don't believe the internets!
> > > >>
> > > >>
> > > >> > While you are correct the payload is a much bigger concern,
> managing
> > > the
> > > >> > metadata and acks centrally on the broker across multiple clients
> at
> > > >> scale
> > > > >> > is also a concern.  This would seem to be exacerbated if you have
> > > >> consumers
> > > >> > at different speeds i.e. Storm and Hadoop consuming the same
> topic.
> > > >> >
> > > >> > In that scenario, say storm consumes the topic messages in
> real-time
> > > and
> > > >> > Hadoop consumes once a day.  Let’s assume the topic consists of
> > 100k+
> > > >> > messages/sec throughput so that in a given day you might have 100s
> > > GBs of
> > > >> > data flowing through the topic.
> > > >> >
> > > >> > To allow Hadoop to consume once a day, Rabbit obviously can’t keep
> > > 100s
> > > >> GBs
> > > >> > in memory and will need to persist this data to its internal DB to
> > be
> > > >> > retrieved later.
> > > >>
> > > >> I am not sure why you think this is a problem?
> > > >>
> > > >> For a fixed number of producers and consumers, the pubsub and
> delivery
> > > >> semantics of Rabbit and Kafka are quite similar.  Think of Rabbit as
> > > >> adding an in-memory cache that is used to (a) speed up read
> > > >> consumption, (b) obviate disk writes when possible due to all client
> > > >> consumers being available and consuming.
> > > >>
> > > >>
> > > >> > I believe when large amounts of data need to be persisted
> > > >> > is the scenario described in the earlier posted Kafka paper (
> > > >> >
> > > >>
> > >
> >
> http://research.microsoft.com/en-us/um/people/srikanth/netdb11/netdb11papers/netdb11-final12.pdf
> > > >> )
> > > >> > where Rabbit’s performance really starts to bog down as compared
> to
> > > >> Kafka.
> > > >>
> > > >> Not sure what parts of the paper you mean?
> > > >>
> > > >> I read that paper when it came out.  I found it strongest when
> > > >> describing Kafka's design philosophy.  I found the performance
> > > >> statements made about Rabbit pretty hard to understand.  This is not
> > > >> meant to be a criticism of the authors!  I have seen very few
> > > >> performance papers about messaging that I would base decisions on.
> > > >>
> > > >>
> > > > >> > This Kafka paper looks to be a few years old
> > > >>
> > > >> Um....  Lots can change in technology very quickly :-)
> > > >>
> > > >> Eg.: At the time this paper was published, Instagram had 5m users.
> > > >> Six months earlier in Dec 2010, it had 1m.  Since then it grew huge
> > > >> and got acquired.
> > > >>
> > > >>
> > > >>
> > > >> > so has something changed
> > > >> > within the Rabbit architecture to alleviate this issue when large
> > > amounts
> > > >> > of data are persisted to the internal DB?
> > > >>
> > > >> Rabbit introduced a new internal flow control system which impacted
> > > >> performance under steady load.  This may be relevant?  I couldn't
> say
> > > >> from reading the paper.
> > > >>
> > > >> I don't have a good reference for this to hand, but here is a post
> > > >> about external flow control that you may find amusing:
> > > >>
> > > >>
> > >
> >
> http://www.rabbitmq.com/blog/2012/05/11/some-queuing-theory-throughput-latency-and-bandwidth/
> > > >>
> > > >>
> > > >> > Do the producer and consumer
> > > >> > numbers look correct?  If no, maybe you can share some Rabbit
> > > benchmarks
> > > >> > under this scenario, because I believe it is the main area where
> > Kafka
> > > >> > appears to be the superior solution.
> > > >>
> > > >> This is from about one year ago:
> > > >>
> > > >>
> > >
> >
> http://www.rabbitmq.com/blog/2012/04/25/rabbitmq-performance-measurements-part-2/
> > > >>
> > > >> Obviously none of this uses batching, which is an easy trick for
> > > >> increasing throughput.
> > > >>
> > > >> YMMV.
> > > >>
> > > >> Is this helping?
> > > >>
> > > >> alexis
> > > >>
> > > >>
> > > >>
> > > >> > Thanks for educating me on these matters.
> > > >> >
> > > >> > -Jonathan
> > > >> >
> > > >> >
> > > >> >
> > > >> > On Fri, Jun 7, 2013 at 6:54 AM, Alexis Richardson <
> > > alexis@rabbitmq.com
> > > >> >wrote:
> > > >> >
> > > >> >> Hi
> > > >> >>
> > > >> >> Alexis from Rabbit here.  I hope I am not intruding!
> > > >> >>
> > > >> >> It would be super helpful if people with questions, observations
> or
> > > >> >> moans posted them to the rabbitmq list too :-)
> > > >> >>
> > > >> >> A few comments:
> > > >> >>
> > > >> >> * Along with ZeroMQ, I consider Kafka to be one of the
> interesting
> > > and
> > > >> >> useful messaging projects out there.  In a world of cruft, Kafka
> is
> > > >> >> cool!
> > > >> >>
> > > >> >> * This is because both projects come at messaging from a specific
> > > >> >> point of view that is *different* from Rabbit.  OTOH, many other
> > > >> >> projects exist that replicate Rabbit features for fun, or NIH, or
> > due
> > > >> >> to misunderstanding the semantics (yes, our docs could be better)
> > > >> >>
> > > >> >> * It is striking how few people describe those differences.  In a
> > > >> >> nutshell they are as follows:
> > > >> >>
> > > >> >> *** Kafka writes all incoming data to disk immediately, and then
> > > >> >> figures out who sees what.  So it is much more like a database
> than
> > > >> >> Rabbit, in that new consumers can appear well after the disk
> write
> > > and
> > > > >> >> still subscribe to past messages.  Instead, Rabbit tries to
> > > >> >> deliver to consumers and buffers otherwise.  Persistence is
> > optional
> > > >> >> but robust and a feature of the buffer ("queue") not the upstream
> > > >> >> machinery.  Rabbit is able to cache-on-arrival via a plugin, but
> > this
> > > >> >> is a design overlay and not particularly optimal.
> > > >> >>
> > > >> >> *** Kafka is a client server system with end to end semantics.
>  It
> > > >> >> defines order to include processing order, and keeps state on the
> > > >> >> client to do this.  Group management is via a 3rd party service
> > > >> >> (Zookeeper? I forget which).  Rabbit is a server-only protocol
> > based
> > > >> >> system which maintains order on the server and through completely
> > > >> >> language neutral protocol semantics.  This makes Rabbit perhaps
> > more
> > > >> >> natural as a 'messaging service' eg for integration and other
> > > >> >> inter-app data transfer.
> > > >> >>
> > > >> >> *** Rabbit is a general purpose messaging system with extras like
> > > >> >> federation.  It speaks many protocols, and has core features like
> > HA,
> > > >> >> transactions, management, etc.  Everything can be switched on or
> > off.
> > > >> >> Getting all this to work while keeping the install light and
> fast,
> > is
> > > >> >> quite fiddly.  Kafka by contrast comes from a specific set of use
> > > >> >> cases, which are interesting certainly.  I am not sure if Kafka
> > wants
> > > >> >> to be a general purpose messaging system, but it will become a
> bit
> > > >> >> more like Rabbit if that is the goal.
> > > >> >>
> > > >> >> *** Both approaches have costs.  In the case of Rabbit the cost
> is
> > > >> >> that more metadata is stored on the broker.  Kafka can get
> > > performance
> > > >> >> gains by storing less such data.  But we are talking about some N
> > > >> >> thousands of MPS versus some M thousands.  At those speeds the
> > > clients
> > > >> >> are usually the bottleneck anyway.
> > > >> >>
> > > >> >> * Let me also clarify some things:
> > > >> >>
> > > >> >> *** Rabbit does NOT store multiple copies of the same message
> > across
> > > >> >> queues, unless they are very small (<60b, iirc).  A message
> > delivered
> > > >> >> to >1 queue on 1 machine is stored once.  Metadata about that
> > message
> > > >> >> may be stored more than once, but, at scale, the big cost is the
> > > >> >> payload.
> > > >> >>
> > > >> >> *** Rabbit's vanilla install does store some index data in memory
> > > when
> > > >> >> messages flow to disk.  You can change this by using a plugin,
> but
> > > >> >> this is a secret-menu undocumented feature.  Very very few people
> > > need
> > > >> >> any such thing.
> > > >> >>
> > > >> >> *** A Rabbit queue is lightweight.  It's just an ordered
> > consumption
> > > >> >> buffer that can persist and ack.  Don't assume things about
> Rabbit
> > > >> >> queues based on what you know about IBM MQ, JMS, and so forth.
> > >  Queues
> > > >> >> in Rabbit and Kafka are not the same.
> > > >> >>
> > > >> >> *** Rabbit does not use mnesia for message storage.  It has its
> own
> > > >> >> DB, optimised for messaging.  You can use other DBs but this is
> > > >> >> Complicated.
> > > >> >>
> > > >> >> *** Rabbit does all kinds of batching and bulk processing, and
> can
> > > >> >> batch end to end.  If you see claims about batching, buffering,
> > etc.,
> > > >> >> find out ALL the details before drawing conclusions.
> > > >> >>
> > > >> >> I hope this is helpful.
> > > >> >>
> > > >> >> Keen to get feedback / questions / corrections.
> > > >> >>
> > > >> >> alexis
> > > >> >>
> > > >> >>
> > > >> >>
> > > >> >>
> > > >> >>
> > > >> >>
> > > >> >>
> > > >> >> On Fri, Jun 7, 2013 at 2:09 AM, Marc Labbe <mrlabbe@gmail.com>
> > > wrote:
> > > >> >> > We also went through the same decision making and our arguments
> > for
> > > >> Kafka
> > > > >> >> > were along the same lines as those Jonathan mentioned. The fact
> > that
> > > we
> > > >> >> have
> > > >> >> > heterogeneous consumers is really a deciding factor. Our
> > > requirements
> > > >> >> were
> > > > >> >> > to avoid losing messages at all costs while having multiple
> > > consumers
> > > >> >> > reading the same data at a different pace. On one side, we
> have a
> > > few
> > > >> >> > consumers being fed with data coming in from most, if not all,
> > > >> topics. On
> > > >> >> > the other side, we have a good bunch of consumers reading only
> > > from a
> > > >> >> > single topic. The big guys can take their time to read while
> the
> > > >> smaller
> > > >> >> > ones are mostly for near real-time events so they need to keep
> up
> > > the
> > > >> >> pace
> > > >> >> > of incoming messages.
> > > >> >> >
> > > >> >> > RabbitMQ stores data on disk only if you tell it to while Kafka
> > > >> persists
> > > >> >> by
> > > >> >> > design. From the beginning, we decided we would try to use the
> > > queues
> > > >> the
> > > >> >> > same way, pub/sub with a routing key (an exchange in RabbitMQ)
> or
> > > >> topic,
> > > >> >> > persisted to disk and replicated.
> > > >> >> >
> > > > >> >> > One of our scenarios was to see how the system would cope with
> the
> > > >> largest
> > > >> >> > consumer down for a while, therefore forcing the brokers to
> keep
> > > the
> > > >> data
> > > >> >> > for a long period. In the case of RabbitMQ, this consumer has
> its
> > > own
> > > >> >> queue
> > > >> >> > and data grows on disk, which is not really a problem if you
> plan
> > > > >> >> > accordingly. But, since it has to keep track of all messages
> > read,
> > > >> the
> > > >> >> > Mnesia database used by RabbitMQ as the messages index also
> grows
> > > >> pretty
> > > >> >> > big. At that point, the amount of RAM necessary becomes very
> > large
> > > to
> > > >> >> keep
> > > >> >> > the level of performance we need. In our tests, we found that
> > this had
> > > an
> > > >> >> > adverse effect on ALL the brokers, thus affecting all
> consumers.
> > > You
> > > >> can
> > > >> >> > always say that you'll monitor the consumers to make sure it
> > won't
> > > >> >> happen.
> > > >> >> > That's a good thing if you can. I wasn't ready to make that
> bet.
> > > >> >> >
> > > >> >> > Another point is the fact that, since we wanted to use pub/sub
> > > with an
> > > > >> >> > exchange in RabbitMQ, we would have ended up with a lot of data
> > > > >> duplication
> > > > >> >> > because if a message is read by multiple consumers, it will get
> > > > >> >> duplicated
> > > > >> >> > in the queue of each of those consumers. Kafka wins on that side
> > too
> > > >> since
> > > >> >> > every consumer reads from the same source.
> > > >> >> >
> > > >> >> > The downsides of Kafka were the language issues (we are using
> > > mostly
> > > >> >> Python
> > > >> >> > and C#). 0.8 is very new and few drivers are available at this
> > > point.
> > > >> >> Also,
> > > >> >> > we will have to try getting as close as possible to
> > > once-and-only-once
> > > > >> >> > guarantee. These are two things where RabbitMQ would have given
> > us
> > > >> less
> > > >> >> > work out of the box as opposed to Kafka. RabbitMQ also
> provides a
> > > >> bunch
> > > >> >> of
> > > > >> >> > tools that make it rather attractive too.
> > > >> >> >
> > > >> >> > In the end, looking at throughput is a pretty nifty thing but
> > being
> > > >> sure
> > > >> >> > that I'll be able to manage the beast as it grows will allow me
> > to
> > > >> get to
> > > >> >> > sleep way more easily.
> > > >> >> >
> > > >> >> >
> > > >> >> > On Thu, Jun 6, 2013 at 3:28 PM, Jonathan Hodges <
> > hodgesz@gmail.com
> > > >
> > > >> >> wrote:
> > > >> >> >
> > > >> >> >> We just went through a similar exercise with RabbitMQ at our
> > > company
> > > >> >> with
> > > >> >> >> streaming activity data from our various web properties.  Our
> > use
> > > >> case
> > > >> >> >> requires consumption of this stream by many heterogeneous
> > > consumers
> > > >> >> >> including batch (Hadoop) and real-time (Storm).  We pointed
> out
> > > that
> > > >> >> Kafka
> > > >> >> >> acts as a configurable rolling window of time on the activity
> > > stream.
> > > >> >>  The
> > > >> >> >> window default is 7 days which allows for supporting clients
> of
> > > >> >> different
> > > >> >> >> latencies like Hadoop and Storm to read from the same stream.
> > > >> >> >>
> > > >> >> >> We pointed out that the Kafka brokers don't need to maintain
> > > consumer
> > > >> >> state
> > > >> >> >> in the stream and only have to maintain one copy of the stream
> > to
> > > >> >> support N
> > > >> >> >> number of consumers.  Rabbit brokers on the other hand have to
> > > >> maintain
> > > >> >> the
> > > >> >> >> state of each consumer as well as create a copy of the stream
> > for
> > > >> each
> > > >> >> >> consumer.  In our scenario we have 10-20 consumers and with
> the
> > > scale
> > > >> >> and
> > > >> >> >> throughput of the activity stream we were able to show Rabbit
> > > quickly
> > > >> >> >> becomes the bottleneck under load.
> > > >> >> >>
> > > >> >> >>
> > > >> >> >>
> > > >> >> >> On Thu, Jun 6, 2013 at 12:40 PM, Dragos Manolescu <
> > > >> >> >> Dragos.Manolescu@servicenow.com> wrote:
> > > >> >> >>
> > > >> >> >> > Hi --
> > > >> >> >> >
> > > >> >> >> > I am preparing to make a case for using Kafka instead of
> > Rabbit
> > > MQ
> > > >> as
> > > >> >> a
> > > >> >> >> > broker-based messaging provider. The context is similar to
> > that
> > > of
> > > >> the
> > > >> >> >> > Kafka papers and user stories: the producers publish
> > monitoring
> > > >> data
> > > >> >> and
> > > >> >> >> > logs, and a suite of subscribers consume this data (some
> store
> > > it,
> > > >> >> others
> > > >> >> >> > perform computations on the event stream). The requirements
> > are
> > > >> >> typical
> > > >> >> >> of
> > > >> >> >> > this context: low-latency, high-throughput, ability to deal
> > with
> > > >> >> bursts
> > > >> >> >> and
> > > >> >> >> > operate in/across multiple data centers, etc.
> > > >> >> >> >
> > > >> >> >> > I am familiar with the performance comparison between Kafka,
> > > >> Rabbit MQ
> > > >> >> >> and
> > > >> >> >> > Active MQ from the NetDB 2011 paper<
> > > >> >> >> >
> > > >> >> >>
> > > >> >>
> > > >>
> > >
> >
> http://research.microsoft.com/en-us/um/people/srikanth/netdb11/netdb11papers/netdb11-final12.pdf
> > > >> >> >> >.
> > > >> >> >> > However in the two years that passed since then the number
> of
> > > >> >> production
> > > >> >> >> > Kafka installations increased, and people are using it in
> > > different
> > > >> >> ways
> > > >> >> >> > than those imagined by Kafka's designers. In light of these
> > > >> >> experiences
> > > >> >> >> one
> > > >> >> >> > can use more data points and color when contrasting to
> Rabbit
> > MQ
> > > >> >> (which
> > > >> >> >> by
> > > >> >> >> > the way also evolved since 2011). (And FWIW I know I am not
> > the
> > > >> first
> > > >> >> one
> > > >> >> >> > to walk this path; see for example last year's OSCON session
> > on
> > > the
> > > >> >> State
> > > >> >> >> > of MQ<http://lanyrd.com/2012/oscon/swrcz/>.)
> > > >> >> >> >
> > > >> >> >> > I would appreciate it if you could share measurements,
> > results,
> > > or
> > > >> >> even
> > > >> >> >> > anecdotal evidence along these lines. How have you avoided
> the
> > > >> "let's
> > > >> >> use
> > > >> >> >> > Rabbit MQ because everybody else does it" route when solving
> > > >> problems
> > > >> >> for
> > > >> >> >> > which Kafka is a better fit?
> > > >> >> >> >
> > > >> >> >> > Thanks,
> > > >> >> >> >
> > > >> >> >> > -Dragos
> > > >> >> >> >
> > > >> >> >>
> > > >> >>
> > > >>
> > >
> >
> >
>
>
> --
> --
> Evan Chan
> Staff Engineer
> ev@ooyala.com  |
>
> <http://www.ooyala.com/>
> <http://www.facebook.com/ooyala><http://www.linkedin.com/company/ooyala><
> http://www.twitter.com/ooyala>
>
