flink-issues mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] hequn8128 commented on a change in pull request #6521: [FLINK-5315][table] Adding support for distinct operation for table API on DataStream
Date Sun, 26 Aug 2018 12:23:12 GMT
hequn8128 commented on a change in pull request #6521: [FLINK-5315][table] Adding support for
distinct operation for table API on DataStream
URL: https://github.com/apache/flink/pull/6521#discussion_r212824390
 
 

 ##########
 File path: docs/dev/table/tableApi.md
 ##########
 @@ -381,6 +381,36 @@ Table result = orders
 {% highlight java %}
 Table orders = tableEnv.scan("Orders");
 Table result = orders.distinct();
+{% endhighlight %}
+        <p><b>Note:</b> For streaming queries the required state to compute
the query result might grow infinitely depending on the number of distinct fields. Please
provide a query configuration with valid retention interval to prevent excessive state size.
See <a href="streaming.html">Streaming Concepts</a> for details.</p>
+      </td>
+    </tr>
+    <tr>
+      <td>
+        <strong>Distinct Aggregation</strong><br>
+        <span class="label label-primary">Streaming</span>
 
 Review comment:
   Add a batch label? It seems batch already supports distinct. BTW, `over` is not yet supported
in batch.
   
   Maybe it would be better to remove this separate documentation on distinct aggregation and
instead append a distinct column to all the Aggregation examples?
   For `GroupBy Aggregation`, change the example
   from
   ```
   Table orders = tableEnv.scan("Orders");
   Table result = orders.groupBy("a").select("a, b.sum as d");
   ```
   to
   ```
   Table orders = tableEnv.scan("Orders");
   Table result = orders.groupBy("a").select("a, b.sum as d, b.sum.distinct as e");
   ```
   What's more, we should probably add UDAGG documentation, like the one in the SQL documentation
on Aggregations, and add a distinct column there as well.
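
To make the difference between the `d` and `e` columns in the proposed example concrete, here is a minimal sketch in plain Java. It does not use the Flink Table API at all; the `Order` class, the sample rows, and the `sum` / `distinctSum` helpers are hypothetical stand-ins that only mimic what `b.sum` and `b.sum.distinct` are expected to compute per group of `a`.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class DistinctSumSketch {

    /** Hypothetical row type standing in for the Orders table (columns a and b). */
    static final class Order {
        final String a;
        final int b;

        Order(String a, int b) {
            this.a = a;
            this.b = b;
        }
    }

    /** Mimics "b.sum": adds every b value within each group of a. */
    static Map<String, Integer> sum(List<Order> orders) {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (Order o : orders) {
            result.merge(o.a, o.b, Integer::sum);
        }
        return result;
    }

    /** Mimics "b.sum.distinct": adds each distinct b value only once per group of a. */
    static Map<String, Integer> distinctSum(List<Order> orders) {
        // Per-group set of b values already counted (assumption: this is the
        // kind of bookkeeping a distinct aggregate needs).
        Map<String, Set<Integer>> seen = new LinkedHashMap<>();
        Map<String, Integer> result = new LinkedHashMap<>();
        for (Order o : orders) {
            Set<Integer> values = seen.computeIfAbsent(o.a, k -> new LinkedHashSet<>());
            // Set.add returns false for a value that was already counted.
            if (values.add(o.b)) {
                result.merge(o.a, o.b, Integer::sum);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Order> orders = Arrays.asList(
            new Order("x", 1), new Order("x", 1), new Order("x", 2), new Order("y", 5));
        System.out.println("d = " + sum(orders));         // prints "d = {x=4, y=5}"
        System.out.println("e = " + distinctSum(orders)); // prints "e = {x=3, y=5}"
    }
}
```

In this sketch the per-group `seen` sets grow with the number of distinct values, which is the same kind of growth the note in the diff warns about for streaming queries.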

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services
