kafka-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jon Yeargers <jon.yearg...@cedexis.com>
Subject KTables + aggregation - how to make lots of ppl happy
Date Thu, 01 Dec 2016 16:43:49 GMT
Seems like there are many questions on SO and related about "how do I know
when my windowed aggregation is 'done'?" The answer has always been "It's
never done. You're thinking about it the wrong way."

I propose a new function for KStream:

finiteAggregateByKey(Initializer, Aggregator, Windows, Timeout)

where 'Timeout' would be a struct of how many seconds after the window is
finished and a topic to dump the resultant values to.

EG I have a topic 'bar' with an 'rolling' aggregator of 20 minutes / 1
minute. During the 20 minutes it's building a list of things from the 'bar'
topic. I tell it to terminate 5 minutes after the initial 20 have elapsed
and write out the final values from 'T
<https://kafka.apache.org/0100/javadoc/org/apache/kafka/streams/kstream/Aggregator.html>
 aggregate' to a topic 'allBar'. This would allow me to put a consumer on
that topic and watch for new entries.

Subsequent values that would have matched said aggregator-window above are
ignored.

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message