kafka-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Peter Levart <peter.lev...@gmail.com>
Subject Re: KTable.suppress(Suppressed.untilWindowCloses) does not suppress some non-final results when the kafka streams process is restarted
Date Tue, 08 Jan 2019 11:57:46 GMT
Hi John,

On 1/8/19 12:45 PM, Peter Levart wrote:
>> I looked at your custom transfomer, and it looks almost correct to 
>> me. The
>> only flaw seems to be that it only looks
>> for closed windows for the key currently being processed, which means 
>> that
>> if you have key "A" buffered, but don't get another event for it for a
>> while after the window closes, you won't emit the final result. This 
>> might
>> actually take longer than the window retention period, in which case, 
>> the
>> data would be deleted without ever emitting the final result.
>
> So in DSL case, the suppression works by flushing *all* of the "ripe" 
> windows in the whole buffer whenever a singe event comes in with 
> recent enough timestamp regardless of the key of that event?
>
> Is the buffer shared among processing tasks or does each task maintain 
> its own private buffer that only contains its share of data pertaining 
> to assigned input partitions? In case the tasks are executed on 
> several processing JVM(s) the buffer can't really be shared, right? In 
> that case a single event can't flush all of the "ripe" windows, but 
> just those that are contained in the task's part of buffer... 

Just a question about your comment above:

/"This might actually take longer than the window retention period, in 
which case, the data would be deleted without ever emitting the final 
result"/

Are you talking about the buffer log topic retention? Aren't log topics 
configured to "compact" rather than "delete" messages? So the last 
"version" of the buffer entry for a particular key should stay forever? 
What are the keys in suppression buffer log topic? Are they a pair of 
(timestamp, key) ? Probably not since in that case the compacted log 
would grow indefinitely...

Another question:

What are the keys in WindowStore's log topic? If the input keys to the 
processor that uses such WindowStore consist of a bounded set of values 
(for example user ids), would compacted log of such WindowStore also be 
bounded?

Regards, Peter


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message