calcite-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Alessandro Solimando <alessandro.solima...@gmail.com>
Subject Re: Agg Call in CountSplitter / SumSplitter
Date Thu, 29 Mar 2018 12:40:32 GMT
Hi Haisgheng,
I agree with you, since the groupby on t1.b, by definition, produces one
tuple per distinct t1.b value, for any given t1.b distinct values you have
at most one tuple both sides, so the direct multiplication cnt*c suffice as
SUM0 sums over a single element.

But I don't know the logic behind the rule, so I cannot say if the
rewriting is there for other reasons (triggering other possible chains of
rewriting for instance), but definitely an interesting remark :)

Best regards,
Alessandro

On 29 March 2018 at 13:54, Haisheng Yuan <hsyuan1@gmail.com> wrote:

> Hi devs,
>
> While reading SqlSplittableAggFunction and AggregateJoinTransposeRule
> source
> code, I noticed that in CountSplitter.topSplit, it creates an SUM0 AggCall,
> which makes sense for the following query:
> select count(t1.a) from t1 join t2 on t1.b=t2.b;
>
> which will be transformed to:
> select sum0(cnt*c) from (select b, count(a) as cnt from t1 group by b) t1
> join (select b, count(*) as c from t2 group by b)t2 on t1.b=t2.b ;
>
> But for another query:
> select t1.b, count(t1.a) from t1 join t2 on t1.b=t2.b group by t1.b;
>
> which will be transformed to:
> select t1.b, sum0(cnt*c) from (select b, count(a) as cnt from t1 group by
> b) t1 join (select b, count(*) as c from t2 group by b)t2 on t1.b=t2.b
> group by t1.b;
>
> In fact, query without additional SUM0 group agg should be enough:
> select t1.b, cnt*c from (select b, count(a) as cnt from t1 group by b) t1
> join (select b, count(*) as c from t2 group by b) t2 on t1.b=t2.b;
>
> Am I missing something? Correct me if I am wrong.
>
> Thanks~
> Haisheng Yuan
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message