kafka-users mailing list archives

From Jun Rao <jun...@gmail.com>
Subject Re: Kafka questions
Date Tue, 19 Jul 2011 02:14:01 GMT
Paul,

Excellent questions. See my answers below. Thanks,

On Mon, Jul 18, 2011 at 6:41 PM, Paul Sutter <psutter@quantbench.com> wrote:

> Kafka looks like an exciting project, thanks for opening it up.
>
> I have a few questions:
>
> 1. Are checksums end to end (i.e., created by the producer and checked by
> the consumer)? Or are they only used to confirm buffer cache behavior on
> disk, as mentioned in the documentation? Bit errors occur vastly more
> often than most people assume, often because of device driver bugs. TCP's
> 16-bit checksum misses roughly 1 error in 65536, so errors can flow
> through (if you like I can send links to papers describing the need for
> checksums everywhere).
>

The checksum is generated at the producer and propagated to the broker and
eventually the consumer. Currently, we only validate the checksum at the
broker; we could further validate it at the consumer in the future.
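
To illustrate the end-to-end idea (a minimal sketch, not Kafka's actual
wire format or API; the class and method names are made up):

    import java.util.zip.CRC32;

    // Sketch: the producer computes a CRC32 over the payload and ships it
    // alongside the message; the broker (and, in principle, the consumer)
    // recomputes it and compares.
    public class ChecksumSketch {
        static long crcOf(byte[] payload) {
            CRC32 crc = new CRC32();
            crc.update(payload, 0, payload.length);
            return crc.getValue();
        }

        public static void main(String[] args) {
            byte[] payload = "hello".getBytes();
            long produced = crcOf(payload);      // computed at the producer
            // ... payload + checksum travel through the broker ...
            if (crcOf(payload) != produced) {    // re-checked at the consumer
                throw new IllegalStateException("corrupt message");
            }
        }
    }

Because the checksum is carried with the message rather than recomputed
per hop, corruption anywhere along the path is detectable.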

>
> 2. The consumer has a pretty solid mechanism to ensure it hasn't missed
> any messages (I like the design, by the way), but how does the producer
> know that all of its messages have been stored? (There is no apparent
> message id on that side, since the message id isn't known until the
> message is written to the file.) I'm especially curious how
> failover/replication could be implemented, and I'm thinking that acks on
> the publisher side may help.
>

Producer-side auditing is not built in. At LinkedIn, we do that by
generating an auditing event periodically in the event handler of the
async producer. The auditing event contains the number of events produced
in a configured window (e.g., 10 minutes) and is sent to a separate topic.
The consumer can read the actual data and the auditing events and compare
the counts. See our paper (
http://research.microsoft.com/en-us/um/people/srikanth/netdb11/netdb11papers/netdb11-final12.pdf)
for some more details.
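
To make the idea concrete, here is a rough sketch (the class and method
names are made up, and the actual produce call is elided):

    import java.util.Timer;
    import java.util.TimerTask;
    import java.util.concurrent.atomic.AtomicLong;

    // Sketch: count messages as they pass through the async producer's
    // event handler, and periodically emit the count to a separate
    // auditing topic so a consumer can reconcile it against the data topic.
    public class AuditingSketch {
        private final AtomicLong counter = new AtomicLong();

        void onEvent(byte[] message) {       // called once per produced message
            counter.incrementAndGet();
            // ... hand the message to the real producer here ...
        }

        void start(final long windowMs) {
            new Timer(true).schedule(new TimerTask() {
                public void run() {
                    long n = counter.getAndSet(0);
                    // the consumer tallies the data topic over the same
                    // window and compares against n
                    sendAudit("audit", System.currentTimeMillis(), n);
                }
            }, windowMs, windowMs);
        }

        void sendAudit(String topic, long windowStart, long count) {
            // placeholder for an actual produce call to the auditing topic
        }
    }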


>
> 3. Has the consumer's flow control been tested over high bandwidth*delay
> links? (What bandwidth can you get from a London consumer of an SF
> cluster?)
>

Yes, we actually replicate Kafka data across data centers, using an
embedded consumer in a broker. Again, there is a bit more info on this in
our paper.
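
Roughly, the embedded consumer works like this (a toy sketch with
hypothetical interfaces, not our actual code):

    // Sketch of cross-datacenter mirroring: a consumer embedded in the
    // destination broker reads from the remote cluster and re-publishes
    // into the local one.
    public class MirrorSketch {
        interface Source { byte[] poll() throws InterruptedException; }
        interface Sink   { void publish(byte[] message); }

        static void mirror(Source remote, Sink local) throws InterruptedException {
            while (true) {
                byte[] m = remote.poll();   // blocking fetch from, say, the SF cluster
                local.publish(m);           // re-publish into the local cluster
            }
        }
    }

Consumers in the remote data center then read from their local cluster
rather than pulling over the WAN link directly.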


> 4. What kind of performance do you get if you set the producer's message
> delay to zero? (i.e., is there a separate system call for each message, or
> do you manage to aggregate messages into a smaller number of system calls
> even with a delay of 0?)
>

I assume that you are referring to the flush interval. One can configure
the broker to flush every message to disk. This will slow down throughput
significantly.
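
For example (just a sketch; the property name here is from memory, so
please check the broker config docs):

    import java.util.Properties;

    // Sketch of the durability/throughput trade-off in broker config.
    public class FlushConfigSketch {
        public static void main(String[] args) {
            Properties brokerProps = new Properties();
            brokerProps.put("log.flush.interval", "1"); // flush every message (assumed name)
            // A larger value batches many messages per fsync, trading a
            // small window of unflushed data for much higher throughput.
        }
    }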


> 5. Have you considered using a library like zeromq for the messaging
> layer instead of rolling your own? (zeromq will handle #4 cleanly at
> millions of messages per second and has clients in 20 languages.)
>

No. Our proprietary format allows us to support things like compression in
the future. However, we can definitely look into the zeromq format. Is
their messaging layer easily extractable?


> 6. Do you have any plans to support intermediate processing elements the
> way Flume supports?
>

For now, we are just focusing on getting the raw messaging layer solid. We
have worked a bit on stream processing and will look into that again in
the future.


> 7. The docs mention that new versions will only be released after they
> are in production at LinkedIn. Does that mean that the latest version of
> the source code is hidden at LinkedIn and contributors would have to
> throw patches over the wall and wait months to get the integrated
> product?
>

What we run at LinkedIn is the same version as the one in open source, and
there is no internal repository of Kafka at LinkedIn. We plan to keep it
that way in the future.


> Thanks!
>
