samza-dev mailing list archives

From "Bae, Jae Hyeon" <metac...@gmail.com>
Subject Re: How to synchronize KeyValueStore and Kafka cleanup
Date Fri, 02 Oct 2015 20:51:38 GMT
I found the following statement in the Samza documentation:

"Periodically the job scans over both stores and deletes any old events
that were not matched within the time window of the join."

It seems that I have to implement purging of the KeyValueStore manually. Did I
understand correctly?
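
A minimal sketch of what such manual purging could look like, under stated assumptions. The class `ExpiringIdCache` and its methods are hypothetical names, and a plain in-memory `HashMap` stands in for the store so the example runs without Samza. In a real job the same scan-and-delete loop would presumably run against the KeyValueStore (iterating its entries and deleting the expired keys) from a periodic callback such as a window method; that wiring is an assumption and is not shown here.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical helper: caches unique ids and purges those older than the
// retention period. The HashMap is only a stand-in for a key-value store.
class ExpiringIdCache {
    // id -> time the id was first seen, in epoch milliseconds
    private final Map<String, Long> store = new HashMap<>();
    private final long retentionMs;

    ExpiringIdCache(long retentionMs) {
        this.retentionMs = retentionMs;
    }

    // Records the id with its first-seen time; returns true if it was new.
    boolean add(String id, long nowMs) {
        return store.putIfAbsent(id, nowMs) == null;
    }

    // Scans every entry and deletes those first seen before the cutoff.
    // Returns the number of entries removed.
    int purge(long nowMs) {
        long cutoff = nowMs - retentionMs;
        int removed = 0;
        Iterator<Map.Entry<String, Long>> it = store.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getValue() < cutoff) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    int size() {
        return store.size();
    }
}
```

Calling `purge` periodically with the current time keeps the cache bounded to roughly one retention period of ids. The full scan is the simplest form; it costs time proportional to the store size on every purge.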

On Fri, Oct 2, 2015 at 1:43 PM, Bae, Jae Hyeon <metacret@gmail.com> wrote:

> Hi Samza devs and users
>
> This is my first try with KeyValueStore and I am really excited!
>
> I glanced through the TaskStorageManager source code; it looks like it
> creates consumers for the stores, and I am wondering how Kafka cleanup will
> be propagated to the KeyValueStore.
>
> My KeyValueStore usage is a little different from the usual cases because
> I have to cache all unique ids seen in the past six hours, and that
> retention period should be configurable. The unique ids, like timestamps,
> are never repeated. In this case, log.cleanup.policy=compact will let the
> KeyValueStore keep growing, right?
>
> Can I use the Samza KeyValueStore for topics
> with log.cleanup.policy=delete? If not, what is your recommended way to
> manage state for a non-changelog Kafka topic? If it is possible, how does
> Kafka cleanup remove outdated records from the KeyValueStore?
>
> Thank you
> Best, Jae
>
>
>
