kafka-users mailing list archives

From "Matthias J. Sax" <matth...@confluent.io>
Subject Re: How does the /tmp/kafka-streams folder work?
Date Thu, 27 Dec 2018 20:57:30 GMT
All data is backed up in the Kafka cluster. Data that is stored locally is
basically a cache, and Kafka Streams will recreate the local data if you
lose it.

Thus, I am not sure how the KTable data could be stale. One possibility
might be a misconfiguration: I assume that you read the topic directly
as a table (i.e., builder.table("topic")). If you do this, the input
topic must be configured with log compaction. If it is configured
with retention instead, you might lose data from the input topic, and if
you also lose the local cache, Kafka Streams cannot recreate the local
state because the data was deleted from the topic (log compaction guards
the input topic against data loss).
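
As a minimal sketch of the check described above: the helper below is
hypothetical (it is not part of the Kafka API) and assumes you have already
fetched the topic's settings into a plain map, for example from the output
of your admin tooling. It treats a topic as safe to read with
builder.table(...) only when its cleanup.policy is exactly "compact".

```java
import java.util.Map;

class TableTopicCheck {
    // Hypothetical helper, not part of Kafka. Returns true only when the
    // topic's cleanup.policy is exactly "compact". A policy of "delete"
    // (or "compact,delete") lets retention remove old records, so the
    // table could not be fully rebuilt after the local state is lost.
    static boolean isSafeForTable(Map<String, String> topicConfig) {
        // Assumption: the broker default is unchanged, so a topic without
        // an explicit cleanup.policy uses "delete".
        String policy = topicConfig.getOrDefault("cleanup.policy", "delete");
        return policy.trim().equals("compact");
    }

    public static void main(String[] args) {
        // Compacted input topic: the latest value per key is kept.
        System.out.println(isSafeForTable(Map.of("cleanup.policy", "compact")));
        // Retention-based topic: old records may already be gone.
        System.out.println(isSafeForTable(Map.of("cleanup.policy", "delete")));
        // No explicit policy: falls back to the assumed default.
        System.out.println(isSafeForTable(Map.of("retention.ms", "604800000")));
    }
}
```

If the check fails for the topic you pass to builder.table(...), switching
that topic to cleanup.policy=compact is the fix suggested above.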


-Matthias


On 12/24/18 12:22 PM, Edmondo Porcu wrote:
> Hello Kafka users,
> 
> we are running a Kafka Streams application in a fully stateless way, meaning
> that we are not persisting /tmp/kafka-streams on a durable volume but
> are rather losing it at each restart. This application is performing a
> KTable-KTable join of data coming from Kafka Connect, and sometimes we want
> to force the output to tick so we update records in the right table from
> the database, but we see that the left table is "stale".
> 
> Is it possible that, because of reboots, the application loses some
> messages? How is the state reconstructed when /tmp/kafka-streams is not available?
> Is the state saved in an intermediate topic?
> 
> Thanks,
> Edmondo
> 

