kafka-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Peter Levart <peter.lev...@gmail.com>
Subject Re: KTable.suppress(Suppressed.untilWindowCloses) does not suppress some non-final results when the kafka streams process is restarted
Date Tue, 08 Jan 2019 12:48:14 GMT


On 1/8/19 12:57 PM, Peter Levart wrote:
> Hi John,
>
> On 1/8/19 12:45 PM, Peter Levart wrote:
>>> I looked at your custom transfomer, and it looks almost correct to 
>>> me. The
>>> only flaw seems to be that it only looks
>>> for closed windows for the key currently being processed, which 
>>> means that
>>> if you have key "A" buffered, but don't get another event for it for a
>>> while after the window closes, you won't emit the final result. This 
>>> might
>>> actually take longer than the window retention period, in which 
>>> case, the
>>> data would be deleted without ever emitting the final result.
>>
>> So in DSL case, the suppression works by flushing *all* of the "ripe" 
>> windows in the whole buffer whenever a singe event comes in with 
>> recent enough timestamp regardless of the key of that event?
>>
>> Is the buffer shared among processing tasks or does each task 
>> maintain its own private buffer that only contains its share of data 
>> pertaining to assigned input partitions? In case the tasks are 
>> executed on several processing JVM(s) the buffer can't really be 
>> shared, right? In that case a single event can't flush all of the 
>> "ripe" windows, but just those that are contained in the task's part 
>> of buffer... 
>
> Just a question about your comment above:
>
> /"This might actually take longer than the window retention period, in 
> which case, the data would be deleted without ever emitting the final 
> result"/
>
> Are you talking about the buffer log topic retention? Aren't log 
> topics configured to "compact" rather than "delete" messages? So the 
> last "version" of the buffer entry for a particular key should stay 
> forever? What are the keys in suppression buffer log topic? Are they a 
> pair of (timestamp, key) ? Probably not since in that case the 
> compacted log would grow indefinitely...
>
> Another question:
>
> What are the keys in WindowStore's log topic? If the input keys to the 
> processor that uses such WindowStore consist of a bounded set of 
> values (for example user ids), would compacted log of such WindowStore 
> also be bounded?

In case the key of WindowStore log topic is (timestamp, key) then would 
explicitly deleting flushed entries from WindowStore (by putting null 
value into the store) keep the compacted log bounded? In other words, 
does WindowStore log topic support a special kind of "tombstone" message 
that effectively removes the key from the compacted log?

In that case, my custom processor could keep entries in its WindowStore 
for as log as needed, depending on the activity of a particular input key...

>
> Regards, Peter
>
>


Mime
View raw message