kafka-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Eno Thereska <eno.there...@gmail.com>
Subject Re: Kafka Streams windowed aggregation
Date Wed, 05 Oct 2016 10:00:20 GMT
Hi Davood,

The behaviour is indeed as you say. Recently we checked in KIP-63 in trunk (it will be part
of the 0.10.1 release coming up). That should reduce the amount of downstream traffic you
see (https://cwiki.apache.org/confluence/display/KAFKA/KIP-63%3A+Unify+store+and+downstream+caching+in+streams).
However your app should still be prepared to receive multiple downstream records. The optimisation
is such that many of them will be de-duped if they have the same key.

Triggers are something we are thinking about but we haven't decided to put it on the roadmap
yet.

Thanks
Eno

> On 5 Oct 2016, at 08:55, Davood Rafiei <rafieidavood9@gmail.com> wrote:
> 
> Hi,
> 
> I want to do windowed aggregation with streams library. However, I get the
> output from particular operator immediately, independent of window size.
> This makes sense for unlimited windows or sometimes for event time windows.
> However, for ingestion time or processing time windows, users may want to
> exact results (and in exact time) of windowed aggregation operator.
> For example, if I have window of 4 minutes with 2 minutes slide, I would
> expect to get an output once per 2 minutes. Otherwise I cannot know which
> one of the outputted tuples from aggregator operator is the "right" that
> contains aggregation result of whole window.
> One solution for this, is using queryable state, but pulling states
> regularly to get latest answers is not useful for my usecase.
> 
> So, is it on your roadmap to integrate purge/trigger mechanism to windowed
> aggregates?
> 
> Thanks
> Davood


Mime
View raw message