kafka-users mailing list archives

From Navneeth Krishnan <reachnavnee...@gmail.com>
Subject Kafka Streams Optimizations
Date Mon, 07 Dec 2020 06:32:44 GMT
Hi All,

I have been working on moving an application to Kafka Streams and have
the following questions.

1. We were planning to use an EFS mount to share RocksDB data for the KV
stores and the global state store, hoping to minimize state restore time
when new instances are brought up. But we later found that global state
stores require a lock, so the directories cannot be shared. Is there some
way around this? How is everyone minimizing state restoration time?

2. Topology optimization: We are using the PAPI, and as per the docs
topology optimization has no effect on the low-level API. Is my
understanding correct?

3. There are about 5 KV stores in our streams application, and for a few
of them the data size is a bit larger. Is there a config to write data to
the changelog topic only once a minute or so? I know that could be a
problem for maintaining data integrity. Basically, we want to reduce the
amount of changelog data written, since we will have some updates for each
user every 5 secs or so. Any suggestions on optimizations?

4. Compress data: Is there an option to compress the data being sent to
and consumed from Kafka only for the intermediate topics? The major reason
is that we don't want to change the final sink, because it's used by many
applications. If we could compress the data only for the intermediate
topics and changelogs, that would be nice.
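
To make questions 3 and 4 concrete, here is a minimal sketch of the kind of
settings in question, written with plain string keys so it runs without the
Kafka jars. The class name and all values are hypothetical placeholders, not
our real configuration. The keys are the standard Kafka Streams
commit.interval.ms and cache.max.bytes.buffering settings and the "producer."
prefix for the embedded producer. Whether this combination is the right way
to cut changelog traffic is exactly what I am unsure about, and as far as I
can tell the producer setting would apply to every topic the application
writes to, including the final sink.

```java
import java.util.Properties;

public class StreamsTuningSketch {

    // Builds the candidate settings for questions 3 and 4.
    // All values are illustrative placeholders, not recommendations.
    static Properties build() {
        Properties props = new Properties();

        // Question 3: commit, and flush the record caches, once a minute.
        props.setProperty("commit.interval.ms", "60000");

        // Question 3: total record cache across all stream threads (50 MB here).
        // Assumption: caching is also enabled on each store, so repeated
        // updates to the same key are collapsed before reaching the changelog.
        props.setProperty("cache.max.bytes.buffering",
                String.valueOf(50L * 1024 * 1024));

        // Question 4: the "producer." prefix forwards compression.type to the
        // producer that Kafka Streams creates internally.
        props.setProperty("producer.compression.type", "lz4");

        return props;
    }

    public static void main(String[] args) {
        // Print the settings so they can be checked or copied into a config.
        build().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

These would be merged with the usual application.id and bootstrap.servers
settings before being handed to the KafkaStreams constructor.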

Thanks and appreciate all the help.

Regards,
Navneeth
