calcite-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Julian Hyde <>
Subject Re: DISTINCT ON clause
Date Wed, 21 Feb 2018 21:31:01 GMT
I can see that it would be useful. But there are a couple of things about it that are messy.

I think the syntax is a bit clunky because it uses parentheses; a query would be difficult
to read if people would like multiple columns, expressions, and aliases.

I also think the semantics are messy. If you read
<> you will see
that “DISTINCT ON” is evaluated after the ORDER BY clause, whereas regular DISTINCT is
evaluated before the ORDER BY clause. So there will be a bizarre interaction if DISTINCT ON
is used with UNION and ORDER BY.

By the way, good ol’ MySQL doesn’t have this problem; you can just write


and it just picks the first value of b and c. This “feature” is one of my least favorite
things about MySQL, so let’s not emulate it. :)


> On Feb 21, 2018, at 12:44 PM, Aman Sinha <> wrote:
> The DISTINCT 'ON' clause is not supported but I am trying to see if there
> are other people who have run into this.  One of the use cases I have
> intuitively maps to something like this:
>  SELECT DISTINCT *ON (a)*,  b, c  FROM T
> Here  suppose 'a' is an INT  and b, c are some complex types such as
> array.  In my example, the values of  b and c happen to be the same in all
> rows belonging to the same group of 'a', so I just want the first row.
> Since these are arrays, I cannot use MIN(b) GROUP BY a .   Other
> alternative is to create a new aggregate function that picks the first
> value but it makes the syntax verbose especially with large number of such
> columns.
> Incidentally, Postgres supports this [1]
> [1]
> -Aman

  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message