kafka-users mailing list archives

From Dominik Safaric <dominiksafa...@gmail.com>
Subject Efficient Kafka batch processing
Date Sat, 10 Dec 2016 18:29:48 GMT
Hi everyone,

What are the most efficient ways to quickly consume, transform, and process Kafka messages?
Importantly, I am not referring to, nor interested in, streams, because the Kafka topic from which
I would like to process the messages will eventually stop receiving messages, after which
I should process the messages by extracting certain keys in a batch-processing-like manner.

So far I’ve implemented a Kafka consumer group that consumes these messages, hashes them
according to a certain key, and upon retrieval of the last message starts the processing script.
However, I am dealing with exactly 100,000,000 log messages of 16 bytes each, meaning that
keeping 1.6 GB of data in memory, i.e. on the heap, is not the most efficient approach, either
performance-wise or memory-wise.
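
To make the setup above concrete, here is a minimal sketch of the in-memory approach and its
raw payload size. It is not my actual consumer code: no Kafka client is involved, the messages
are plain 16-byte values, and the key layout (first 4 bytes of each message) as well as the names
raw_payload_bytes, extract_key and group_by_key are assumptions made only for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List

MESSAGE_COUNT = 100_000_000  # number of log messages, from the post
MESSAGE_SIZE = 16            # bytes per message, from the post


def raw_payload_bytes(count: int, size: int) -> int:
    """Raw payload size only; ignores any per-object overhead of the runtime."""
    return count * size


def extract_key(message: bytes) -> bytes:
    # Hypothetical key layout: the first 4 bytes of each 16-byte message.
    return message[:4]


def group_by_key(
    messages: Iterable[bytes],
    key_of: Callable[[bytes], bytes] = extract_key,
) -> Dict[bytes, List[bytes]]:
    """Collect every message into an in-memory bucket chosen by its key."""
    buckets: Dict[bytes, List[bytes]] = defaultdict(list)
    for message in messages:
        buckets[key_of(message)].append(message)
    return dict(buckets)


if __name__ == "__main__":
    total = raw_payload_bytes(MESSAGE_COUNT, MESSAGE_SIZE)
    print(f"{total} bytes = {total / 1e9:.1f} GB = {total / 2**30:.2f} GiB")

    # Tiny stand-in for the consumed messages: 6 messages spread over 3 keys.
    sample = [bytes([i % 3]) * 4 + bytes(12) for i in range(6)]
    for key, bucket in group_by_key(sample).items():
        print(key.hex(), len(bucket))
```

The 1.6 GB figure counts payload bytes only. Every bucket holds all of its messages until the
last one arrives, so the whole data set sits in memory at once, which is the cost I would like
to avoid.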
