nifi-users mailing list archives

From <christophe.mon...@post.ch>
Subject RE: ConsumeKafkaRecord Performance Issue
Date Fri, 19 Jun 2020 06:53:56 GMT
Hi Josef

I noticed that you ran kafka-consumer-perf-test.sh from Kafka 2.3.1, but NiFi is bundled with
kafka-clients-2.0.0.jar.
Maybe you could try the performance test with the same client version?

Which version are your Kafka brokers running?

Regards
Chris

From: Josef.Zahner1@swisscom.com <Josef.Zahner1@swisscom.com>
Sent: Friday, 19. June 2020 07:55
To: users@nifi.apache.org
Subject: ConsumeKafkaRecord Performance Issue

Hi guys,

We have run into strange behavior with the ConsumeKafkaRecord processor (and its counterpart
ConsumeKafka). We have a Kafka topic with 15 partitions and a producer which, via NiFi, inserts
about 40k records per second into the topic at peak. It doesn’t matter whether we use the
8-node cluster or configure execution on “Primary Node”: the performance is terrible. We ran
a test with execution on “Primary Node”, starting with one thread; you can see the result
below. As soon as we reached 3 threads the performance dropped and never went higher than
that, no matter how many threads or cluster nodes we used. We tried 2 threads on the 8-node
cluster (16 threads in total) and even more. It didn’t help; we were stuck at 12’000’000 to
14’000’000 records per 5 min (roughly 40k to 47k records per second). By the way, during the
tests the consumer was always lagging behind the latest offset, so there were plenty of
messages waiting in the Kafka topic.
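
As a quick sanity check of the figures above, here is a minimal Python sketch that converts the observed 5-minute record counts into records per second. The function and variable names are made up for illustration; only the two record counts and the 5-minute window come from the test.

```python
# Convert the observed NiFi throughput (records per 5-minute window)
# into records per second. Names are illustrative only.
WINDOW_SECONDS = 5 * 60


def records_per_second(records_per_window: int, window_seconds: int = WINDOW_SECONDS) -> float:
    """Average records per second over the given window."""
    return records_per_window / window_seconds


low = records_per_second(12_000_000)
high = records_per_second(14_000_000)
print(f"{low:.0f} to {high:.0f} records/s")
```

The result sits right around the roughly 40k records per second that the producer writes at peak, which would explain why the consumer never catches up on the backlog.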

[Image: test result; not preserved in the archive]


We also tested with the performance script that ships with Kafka. It showed about 250k
messages/s without any tuning at all (though of course without any decoding of the messages).
So in theory Kafka and the network in between cannot be the culprit. It must be something
within NiFi.

[user@nifi ~]$ /opt/kafka_2.12-2.3.1/bin/kafka-consumer-perf-test.sh \
    --broker-list kafka.xyz.net:9093 \
    --group nifi --topic events \
    --consumer.config /opt/sbd_kafka/credentials_prod/client-ssl.properties \
    --messages 3000000

start.time, end.time, data.consumed.in.MB, MB.sec, data.consumed.in.nMsg, nMsg.sec, rebalance.time.ms, fetch.time.ms, fetch.MB.sec, fetch.nMsg.sec
2020-06-15 17:20:05:273, 2020-06-15 17:20:20:429, 515.7424, 34.0289, 3000000, 197941.4093, 3112, 12044, 42.8215, 249086.6822
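
To show how the columns in this output relate to each other, here is a small Python sketch that recomputes the two message rates from the timestamps and counters printed above. The values are copied from the output; the variable names are illustrative. That the fetch time equals the elapsed time minus the rebalance time is inferred from these numbers, not taken from Kafka documentation.

```python
from datetime import datetime, timedelta

# Values copied from the perf-test output above.
FMT = "%Y-%m-%d %H:%M:%S:%f"  # the trailing field is milliseconds
start = datetime.strptime("2020-06-15 17:20:05:273", FMT)
end = datetime.strptime("2020-06-15 17:20:20:429", FMT)
messages = 3_000_000
rebalance_ms = 3112

# Wall-clock duration of the whole run in milliseconds.
elapsed_ms = (end - start) // timedelta(milliseconds=1)

# Inferred from the numbers: fetch time excludes the consumer group rebalance.
fetch_ms = elapsed_ms - rebalance_ms

overall_rate = messages / (elapsed_ms / 1000)  # corresponds to nMsg.sec
fetch_rate = messages / (fetch_ms / 1000)      # corresponds to fetch.nMsg.sec

print(f"elapsed: {elapsed_ms} ms, fetch: {fetch_ms} ms")
print(f"overall: {overall_rate:.4f} msg/s, fetch only: {fetch_rate:.4f} msg/s")
```

So the roughly 250k messages/s figure is the fetch-only rate; including the rebalance at startup, the run averaged about 198k messages/s.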


We have also seen that “Max Poll Records” is never reached in our case: we got at most
about 400 records in one flowfile even though we configured 100’000, which could be part
of the problem.

[Image: image003.png; not preserved in the archive]

It seems I’m not alone with this issue, although the performance reported there was even
worse than ours:
https://stackoverflow.com/questions/62104646/nifi-poor-performance-of-consumekafkarecord-2-0-and-consumekafka-2-0

Any help would be really appreciated.

If nobody has an idea, I will have to open a bug ticket :-(.

Cheers, Josef

