samza-dev mailing list archives

From "Bae, Jae Hyeon" <>
Subject Re: How to synchronize KeyValueStore and Kafka cleanup
Date Fri, 02 Oct 2015 20:51:38 GMT
I found the following statement from Samza documentation:

"Periodically the job scans over both stores and deletes any old events
that were not matched within the time window of the join."

It seems that I have to implement purging of the KeyValueStore manually. Did
I understand correctly?
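
For illustration, the periodic scan-and-delete described in that statement could be sketched roughly as below. This is only a minimal sketch under stated assumptions: the class ExpiringIdCache and its methods are hypothetical names, and a java.util.TreeMap stands in for the KeyValueStore so that the example runs on its own. In a real job, the same loop would presumably iterate over the store's all() iterator and call delete() on each expired key from a WindowableTask's window() method.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

/**
 * Hypothetical stand-in for a KeyValueStore<String, Long> that maps each
 * unique id to the time (in ms) at which it was first seen.
 */
class ExpiringIdCache {
    // Assumption: a sorted in-memory map replaces the real store here.
    private final TreeMap<String, Long> store = new TreeMap<>();
    private final long retentionMs;

    ExpiringIdCache(long retentionMs) {
        this.retentionMs = retentionMs;
    }

    /** Records the id if it is new; returns false if it was already cached. */
    boolean putIfAbsent(String id, long nowMs) {
        return store.putIfAbsent(id, nowMs) == null;
    }

    /** Scans every entry and deletes those older than the retention period. */
    int purge(long nowMs) {
        int removed = 0;
        Iterator<Map.Entry<String, Long>> it = store.entrySet().iterator();
        while (it.hasNext()) {
            if (nowMs - it.next().getValue() > retentionMs) {
                it.remove(); // counterpart of deleting the key from the store
                removed++;
            }
        }
        return removed;
    }

    int size() {
        return store.size();
    }
}
```

With a six-hour retention, purge() would be called periodically with the current time. Each call is a full scan, so its cost grows with the number of cached ids.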

On Fri, Oct 2, 2015 at 1:43 PM, Bae, Jae Hyeon <> wrote:

> Hi Samza devs and users
> This is my first try with KeyValueStore and I am really excited!
> I glanced through the TaskStorageManager source code. It looks like it
> creates consumers for stores, and I am wondering how Kafka cleanup will be
> propagated to the KeyValueStore.
> My KeyValueStore usage is a little different from the usual cases because
> I have to cache all unique ids for the past six hours, a period that can be
> configured as the retention. Unique ids, like timestamps, are never
> repeated. In this case, log.cleanup.policy=compact will let the
> KeyValueStore keep growing, right?
> Can I use the Samza KeyValueStore for topics
> with log.cleanup.policy=delete? If not, what is your recommended way to
> manage state for a non-changelog Kafka topic? If it is possible, how does
> Kafka cleanup remove outdated records in the KeyValueStore?
> Thank you
> Best, Jae
