samza-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chris Riccomini <criccom...@apache.org>
Subject Re: Handling defaults and windowed aggregates in stream queries
Date Mon, 02 Mar 2015 00:20:05 GMT
Hey Milinda,

Are you referring to this thread?

http://mail-archives.apache.org/mod_mbox/incubator-calcite-dev/201502.mbox/%3CCACwebjTshFNi=eS4QZ1ZKkQMUYGZN+LWj_bAPqPdrVSY2tQW1A@mail.gmail.com%3E

It appears as though your question remains unanswered. :(

> If we consider LogicalFilter as a relation operator, we need something to convert
input stream to a relation before sending the tuples downstream.

Does CQL model this as a window operator where every window is exactly 1
row (i.e. a sliding window of length 1)?

Cheers,
Chris

On Thu, Feb 26, 2015 at 12:18 PM, Milinda Pathirage <mpathira@umail.iu.edu>
wrote:

> Hi devs,
>
> I ask about $subject in calcite-dev. You can find the archived discussion
> at [1]. I think your thoughts are also valuable in this discussion in
> calcite list.
>
> I discovered the requirement for a default window operator when I tried to
> integrate streamscan (I was using tablescan prevously) into the physical
> plan generation logic. Because of the way we have written the
> OperatorRouter API, we always need a stream-to-relation operator at the
> input. But Calcite generates a query plan like following:
>
> LogicalDelta
>   LogicalProject(id=[$0], product=[$1], quantity=[$2])
>     LogicalFilter(condition=[>($2, 5)])
>
>       StreamScan(table=[[KAFKA, ORDERS]], fields=[[0, 1, 2]])
>
> If we consider LogicalFilter as a relation operator, we need something to
> convert input stream to a relation before sending the tuples downstream.
> In addition to this, there is a optimization where we consider filter
> operator as a tuple operator and have it between StreamScan and
> stream-to-relation operator as a way of reducing the amount of messages
> going downstream.
>
> Other scenario is windowed aggregates. Currently window spec is attached to
> the LogicalProject in query plan like following:
>
> LogicalProject(EXPR$0=[CASE(>(COUNT($2) OVER (ROWS BETWEEN 2 PRECEDING AND
> 2 FOLLOWING), 0), CAST($SUM0($2) OVER (ROWS BETWEEN 2 PRECEDING AND 2
> FOLLOWING)):INTEGER, null)])
>
> I wanted to know from them whether it is possible to move window operation
> just after the stream scan, so that it is compatible with our operator
> layer.
> May be there are better or easier ways to do this. So your comments are
> always welcome.
>
> Thanks
> Milinda
>
>
> [1]
>
> http://mail-archives.apache.org/mod_mbox/incubator-calcite-dev/201502.mbox/browser
>
> --
> Milinda Pathirage
>
> PhD Student | Research Assistant
> School of Informatics and Computing | Data to Insight Center
> Indiana University
>
> twitter: milindalakmal
> skype: milinda.pathirage
> blog: http://milinda.pathirage.org
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message