samza-dev mailing list archives

From Julian Hyde <jh...@apache.org>
Subject Re: How to do aggregation in Samza?
Date Wed, 21 Oct 2015 16:17:19 GMT
I am helping with the SQL support. I don’t know the timelines, but I wanted to chime in on the
different aggregate operations.

There are several ways to aggregate streams: tumbling, hopping, and sliding windows. For example,
if you want to periodically emit totals that collapse many rows into one total, you want a
tumbling window. Read through [1] to find out which kind of window your application needs.
Ultimately Samza will support each of these kinds of windows, via SQL and via its API.
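
As a rough illustration of how tumbling and hopping windows differ, here is a minimal sketch in
plain Python. It is not Samza's or Calcite's API: the function names, the (timestamp, amount)
event shape, and the sample data are all assumptions made up for this example. Windows are
identified by their start time and cover [start, start + window_size).

```python
from collections import defaultdict

def tumbling_totals(events, window_size):
    """Sum amounts per non-overlapping window (hypothetical helper).

    events: iterable of (timestamp, amount) pairs with integer timestamps.
    Returns a sorted list of (window_start, total) pairs.
    """
    totals = defaultdict(int)
    for timestamp, amount in events:
        # Each event falls into exactly one window.
        window_start = timestamp - (timestamp % window_size)
        totals[window_start] += amount
    return sorted(totals.items())

def hopping_totals(events, window_size, hop):
    """Sum amounts per window when a new window starts every `hop` units.

    With hop < window_size the windows overlap, so one event can be
    counted in several windows. With hop == window_size this reduces
    to the tumbling case.
    """
    totals = defaultdict(int)
    for timestamp, amount in events:
        # Start from the latest window that begins at or before the event,
        # then walk back through every earlier window that still covers it.
        start = timestamp - (timestamp % hop)
        while start > timestamp - window_size:
            totals[start] += amount
            start -= hop
    return sorted(totals.items())

# Made-up sample data: (timestamp, amount)
events = [(1, 10), (4, 5), (12, 7), (19, 3), (25, 1)]

print(tumbling_totals(events, 10))
print(hopping_totals(events, 10, 5))
```

In the tumbling case each input row contributes to exactly one emitted total, which is the
"collapse many rows into one total" behavior described above. A sliding window is not sketched
here; see [1] for how it is defined.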

Julian

[1] https://calcite.incubator.apache.org/docs/stream.html

> On Oct 21, 2015, at 8:15 AM, jeremy p <athomewithagroovebox@gmail.com> wrote:
> 
> Hey all,
> 
> So, I'm wanting to do aggregate operations in Samza.  Counts, averages,
> grouping, things of that nature.  Basically, the kinds of aggregate
> operations you can do in SQL.  What's the best way to do this in Samza?
> Are there any libraries for this?
> 
> I noticed a few projects are currently in development.  Have any of them
> been used in a production environment?  Are any of them being actively
> developed?
> 
> From combing through the mailing list archives, I can see that LinkedIn
> plans on adding SQL and aggregation to Samza.  Any idea what the timeline
> is for this?
> 
> --Jeremy

