kafka-users mailing list archives

From "Xu, Nan" <n...@baml.com.INVALID>
Subject kafka latency for large message
Date Thu, 14 Mar 2019 20:43:03 GMT
Hi, 
   
    We are using Kafka to send messages, and fewer than 1% of them are very large, close to
30 MB. We understand that Kafka is not ideal for large messages, but because the rate of large
messages is very low, we would like Kafka to handle them anyway while still getting reasonable
latency.

    To test, I set up a topic named test on a single local Kafka broker, with only 1 partition
and 1 replica, and ran the following command:

./kafka-producer-perf-test.sh --topic test --num-records 2000000 --throughput 1 \
    --record-size 30000000 --producer.config ../config/producer.properties

Producer config (../config/producer.properties):

#Max 40M message
max.request.size=40000000
buffer.memory=40000000

#2M buffer
send.buffer.bytes=2000000

6 records sent, 1.1 records/sec (31.00 MB/sec), 973.0 ms avg latency, 1386.0 max latency.
6 records sent, 1.0 records/sec (28.91 MB/sec), 787.2 ms avg latency, 1313.0 max latency.
5 records sent, 1.0 records/sec (27.92 MB/sec), 582.8 ms avg latency, 643.0 max latency.
6 records sent, 1.1 records/sec (30.16 MB/sec), 685.3 ms avg latency, 1171.0 max latency.
5 records sent, 1.0 records/sec (27.92 MB/sec), 629.4 ms avg latency, 729.0 max latency.
5 records sent, 1.0 records/sec (27.61 MB/sec), 635.6 ms avg latency, 673.0 max latency.
6 records sent, 1.1 records/sec (30.09 MB/sec), 736.2 ms avg latency, 1255.0 max latency.
5 records sent, 1.0 records/sec (27.62 MB/sec), 626.8 ms avg latency, 685.0 max latency.
5 records sent, 1.0 records/sec (28.38 MB/sec), 608.8 ms avg latency, 685.0 max latency.
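
As a rough aid for reading these numbers, the per-window lines can be combined into one record-weighted average. The sketch below is only an illustration: the regular expression assumes the exact line format shown above, and the names PERF_OUTPUT, LINE_RE and summarize are hypothetical, not part of the Kafka tooling.

```python
import re

# Sample lines copied verbatim from the perf-test output above.
PERF_OUTPUT = """\
6 records sent, 1.1 records/sec (31.00 MB/sec), 973.0 ms avg latency, 1386.0 max latency.
6 records sent, 1.0 records/sec (28.91 MB/sec), 787.2 ms avg latency, 1313.0 max latency.
5 records sent, 1.0 records/sec (27.92 MB/sec), 582.8 ms avg latency, 643.0 max latency.
6 records sent, 1.1 records/sec (30.16 MB/sec), 685.3 ms avg latency, 1171.0 max latency.
5 records sent, 1.0 records/sec (27.92 MB/sec), 629.4 ms avg latency, 729.0 max latency.
5 records sent, 1.0 records/sec (27.61 MB/sec), 635.6 ms avg latency, 673.0 max latency.
6 records sent, 1.1 records/sec (30.09 MB/sec), 736.2 ms avg latency, 1255.0 max latency.
5 records sent, 1.0 records/sec (27.62 MB/sec), 626.8 ms avg latency, 685.0 max latency.
5 records sent, 1.0 records/sec (28.38 MB/sec), 608.8 ms avg latency, 685.0 max latency.
"""

# Assumed line format: "<n> records sent, ... <avg> ms avg latency, <max> max latency."
LINE_RE = re.compile(
    r"(\d+) records sent, .*?([\d.]+) ms avg latency, ([\d.]+) max latency"
)


def summarize(text):
    """Return (total records, record-weighted avg latency in ms, worst max latency in ms)."""
    total = 0
    weighted_sum = 0.0
    worst = 0.0
    for match in LINE_RE.finditer(text):
        count = int(match.group(1))
        avg_ms = float(match.group(2))
        max_ms = float(match.group(3))
        total += count
        weighted_sum += count * avg_ms  # weight each window by its record count
        worst = max(worst, max_ms)
    if total == 0:
        raise ValueError("no perf-test lines matched the assumed format")
    return total, weighted_sum / total, worst


if __name__ == "__main__":
    records, avg_ms, max_ms = summarize(PERF_OUTPUT)
    print(f"{records} records, {avg_ms:.1f} ms weighted avg, {max_ms:.1f} ms max")
```

For the nine windows above this covers 49 records, with a weighted average of roughly 704 ms and a worst-case latency of 1386 ms.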


On the broker, I changed the following settings:

socket.send.buffer.bytes=2024000
# The receive buffer (SO_RCVBUF) used by the socket server
socket.receive.buffer.bytes=2224000

All other broker settings are left at their defaults.

I am a little surprised to see a max latency of about 1 s and an average latency above 0.5 s.
My understanding is that Kafka memory-maps the log file and lets the operating system flush it,
and that all writes are sequential, so flush time should not depend much on message size.
Batching and network transfer will take longer for large messages, but those are memory-based
and on the local machine. My SSD should be far faster than 0.5 seconds. Where is the time
being spent? Any suggestions?
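
One way to check the disk assumption in isolation is to time a sequential write of a single 30 MB payload outside Kafka. The sketch below is a minimal illustration, not a Kafka benchmark: the function name time_sequential_write, the 1 MB chunk size, and the use of a temporary file are all assumptions made for the example, and it forces an fsync, so it should be read as a pessimistic bound on the raw write cost rather than as what the broker does.

```python
import os
import tempfile
import time


def time_sequential_write(size_bytes=30_000_000, chunk_bytes=1_000_000, directory=None):
    """Write size_bytes sequentially to a temp file, fsync, and return (bytes written, seconds)."""
    payload = os.urandom(chunk_bytes)  # random data so the bytes are not trivially compressible
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as handle:
            written = 0
            while written < size_bytes:
                n = min(chunk_bytes, size_bytes - written)
                handle.write(payload[:n])
                written += n
            handle.flush()
            os.fsync(handle.fileno())  # force the data to the device before stopping the clock
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)
    return written, elapsed


if __name__ == "__main__":
    # Pass directory=... to place the file on the same disk as the Kafka log directory.
    written, elapsed = time_sequential_write()
    print(f"wrote {written} bytes in {elapsed * 1000:.1f} ms")
```

If this reports a time far below the observed latency, the disk is unlikely to be the main contributor, and the producer and network path are the next places to look.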

Thanks,
Nan







