kafka-users mailing list archives

From Revin Chalil <rcha...@expedia.com>
Subject Re: performance test using real data - comparing throughput & latency
Date Fri, 15 Sep 2017 19:01:13 GMT
Any thoughts on the below will be appreciated. Thanks. 

On 9/13/17, 5:00 PM, "Revin Chalil" <rchalil@expedia.com> wrote:

    We are testing Kafka’s performance with real production data and plan to test things like
the items below. We would have producers publishing and consumers processing production data on
a separate non-prod Kafka cluster.
      *   Impact of number of Partitions per Topic on throughput and latency on Producer &
      *   Impact of scaling-up Brokers on throughput and latency
      *   adding more brokers Vs adding more Disk on existing Brokers. How does the network
interface usage differ?
      *   cost of Replication on Throughput and Latency
      *   impact of Broker vm.swappiness = 60 Vs vm.swappiness = 1
      *   partitions on a Broker pointing to single Disk Vs multiple Disks
      *   EXT4 Vs XFS Filesystem on broker
      *   behavior when Broker “num.io.threads” is increased from
8 to higher value
      *   behavior when Broker “num.network.threads” is increased from 3  to higher value
      *   behavior when the data segment size is increased from 1 GB (current setting)
      *   producer “acks = 1” Vs “acks = all” (current setting) impact on throughput
and latency
      *   producer sending with Compression enabled (snappy?) Vs sending without Compression
      *   setting producer batch-size (memory based) Vs record-count (current setting) per
batch sent to Kafka
      *   impact of message size on throughput
      *   Consumers fetching records from page-cache Vs fetching records from Disk
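
Several of the producer-side items above (acks, compression, batch size) can be compared by changing one setting at a time against a fixed baseline. A minimal sketch of generating such runs follows. The property names `acks`, `compression.type` and `batch.size` are standard Kafka producer configs; the baseline values for compression and batch size, the alternative batch size, and the helper name `one_factor_variants` are assumptions for illustration only, not settings taken from this thread (only `acks = all` is stated as current).

```python
# Baseline producer settings. Only acks=all is stated as the current setting;
# the compression and batch.size values here are illustrative assumptions.
BASELINE = {
    "acks": "all",
    "compression.type": "none",
    "batch.size": 16384,  # bytes (memory-based batching)
}

# Values to try for each setting, one at a time (hypothetical choices).
ALTERNATIVES = {
    "acks": ["1", "all"],
    "compression.type": ["none", "snappy"],
    "batch.size": [16384, 65536],
}


def one_factor_variants(baseline, alternatives):
    """Return (label, config) pairs: the baseline plus one run per changed setting."""
    runs = [("baseline", dict(baseline))]
    for key, values in alternatives.items():
        for value in values:
            if value == baseline.get(key):
                continue  # identical to the baseline, no separate run needed
            config = dict(baseline)
            config[key] = value
            runs.append((f"{key}={value}", config))
    return runs


if __name__ == "__main__":
    for label, config in one_factor_variants(BASELINE, ALTERNATIVES):
        print(label, config)
```

Changing one setting per run keeps each comparison attributable to a single cause; a full cross-product of settings would need many more runs.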
    Ideally, the metrics we would like to compare for each test are (please let us know if there
is anything else to be compared)
      *   Producer write Throughput
      *   Producer write Latency (ms)
      *   Consumption Throughput
      *   Consumption Latency (ms)
      *   End-to-end Latency
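
If the custom producers and consumers log a timestamp per record, the throughput and end-to-end latency figures above can be derived offline. The following is a rough sketch under stated assumptions: each record is logged as a hypothetical tuple `(produce_ts_ms, consume_ts_ms, size_bytes)`, the producer and consumer clocks are synchronized, and the function names `percentile` and `summarize` are made up for this example.

```python
import math


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(pct / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def summarize(records):
    """Summarize one test run.

    records: non-empty list of (produce_ts_ms, consume_ts_ms, size_bytes) tuples
    (an assumed log format, not something Kafka emits by itself).
    """
    latencies = sorted(consumed - produced for produced, consumed, _ in records)
    start_ms = min(produced for produced, _, _ in records)
    end_ms = max(consumed for _, consumed, _ in records)
    # Guard against a zero-length window when all timestamps coincide.
    elapsed_s = max((end_ms - start_ms) / 1000.0, 1e-9)
    total_bytes = sum(size for _, _, size in records)
    return {
        "records_per_sec": len(records) / elapsed_s,
        "mb_per_sec": total_bytes / 1e6 / elapsed_s,
        "p50_ms": percentile(latencies, 50),
        "p99_ms": percentile(latencies, 99),
        "max_ms": latencies[-1],
    }
```

Reporting percentiles rather than only an average makes tail latency visible, which tends to matter when comparing settings such as acks or batching.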
    What would be the right tools to collect and compare the above metrics across the different
tests? I have set up kafka-monitor but couldn’t find how to track the throughput and latency.
Kafka-web-console seems to have some of these available? Kafka-Manager? Burrow? Anything else?
Thank you.
    Since we are going to use our own producers and consumers, I do not think it makes sense
to use tools like kafka-consumer-perf-test.sh or kafka-producer-perf-test.sh, but please correct
me if I am wrong.
