kafka-users mailing list archives

From "Bruno D. Rodrigues" <bruno.rodrig...@litux.org>
Subject Re: Anyone running kafka with a single broker in production? what about only 8GB ram?
Date Fri, 11 Oct 2013 17:27:01 GMT

> On Thu, Oct 10, 2013 at 3:57 PM, Bruno D. Rodrigues <
> bruno.rodrigues@litux.org> wrote:
>> My personal newbie experience, which is surely completely wrong and
>> misconfigured, got me up to 70MB/sec, either with controlled 1K messages
>> (hence 70Kmsg/sec) as well as with more random data (test data from 100
>> bytes to a couple MB). First I thought the 70MB were the hard disk limit,
>> but when I got the same result both with a proper linux server with a 10K
>> disk, as well as with a Mac mini with a 5400rpm disk, I got confused.
>> The mini has 2GB, the linux server has 8 or 16, can't recall at the moment.
>> The test was performed both with single and multi producers and consumers.
>> One producer = 70MB, two producers = 35MB each and so forth. Running
>> standalone instances on each server, same value. Running both together in 2
>> partition 2 replica crossed mode, same result.
>> As far as I understood, more memory just means more kernel buffer space to
>> speed up the lack of disk speed, as kafka seems to not really depend on
>> memory for the queueing.

On 11/10/2013, at 17:28, Guozhang Wang <wangguoz@gmail.com> wrote:

> Hello,
> In most cases of Kafka, network bottleneck will be hit before the disk
> bottleneck. So maybe you want to check your network capacity to see if it
> has been saturated.

They are all connected to Gbit ethernet cards and proper network routers. I can easily get
way above 950Mbps up and down between each machine and even between multiple machines. Gbit
is 125MB/s. 70MB/s is 560Mbps. So far so good, 56% network capacity is a goodish value. But
then I enable snappy, get the same 70MB/sec on the input and output side, and 20MB/sec on the
network, so it surely isn't a network limit. It's also not on the input or output side: the
input reads a pre-processed mmapped file that reads at 150MB/sec without cache (SSD) and up to
3GB/sec when loaded into memory. The output simply counts the messages and their sizes.
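
As a rough sanity check of the network arithmetic, here is a small Python sketch. The function names are made up for illustration, and it assumes decimal units (1 Gbit/s = 1000 Mbit/s, 1 byte = 8 bits) and ignores protocol overhead:

```python
# Back-of-the-envelope link utilization for the throughput figures in this thread.
# Assumption: decimal units and no Ethernet/TCP overhead.
LINK_MBIT_PER_S = 1000  # nominal Gigabit Ethernet


def mbyte_to_mbit(mbyte_per_s):
    """Convert a rate in MB/s to Mbit/s."""
    return mbyte_per_s * 8


def utilization(mbyte_per_s, link_mbit_per_s=LINK_MBIT_PER_S):
    """Fraction of the link consumed by a given payload rate."""
    return mbyte_to_mbit(mbyte_per_s) / link_mbit_per_s


link_mbyte_per_s = LINK_MBIT_PER_S / 8  # theoretical ceiling in MB/s
uncompressed = utilization(70)          # 70MB/s seen without compression
compressed = utilization(20)            # 20MB/s seen on the wire with snappy

print(f"link ceiling: {link_mbyte_per_s:.0f} MB/s")
print(f"70 MB/s = {mbyte_to_mbit(70)} Mbit/s, {uncompressed:.0%} of the link")
print(f"20 MB/s = {mbyte_to_mbit(20)} Mbit/s, {compressed:.0%} of the link")
```

With snappy the wire traffic drops to a small fraction of the link while the end-to-end rate stays at 70MB/s, which is why the network does not look like the limiting factor here.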

One weird thing is that the kafka process never seems to go above 100% CPU in top or an equivalent
command. Top shows 100% per CPU, so a multi-threaded process should be able to reach 400% (both
the linux server and the mac mini have 2 CPUs with hyperthreading, so "almost" 4 CPUs).
