flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-3943) Add support for EXCEPT (set minus)
Date Wed, 29 Jun 2016 23:51:12 GMT

    [ https://issues.apache.org/jira/browse/FLINK-3943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15356206#comment-15356206 ]

ASF GitHub Bot commented on FLINK-3943:
---------------------------------------

Github user fhueske commented on a diff in the pull request:

    https://github.com/apache/flink/pull/2169#discussion_r69048813
  
    --- Diff: flink-libraries/flink-table/src/test/scala/org/apache/flink/api/scala/batch/table/SetOperationsITCase.scala ---
    @@ -139,4 +154,105 @@ class UnionITCase(
         // Must fail. Tables are bound to different TableEnvironments.
         ds1.unionAll(ds2).select('c)
       }
    +
    +  @Test
    +  def testSetMinusAll(): Unit = {
    +    val env: ExecutionEnvironment = ExecutionEnvironment.getExecutionEnvironment
    +    val tEnv = TableEnvironment.getTableEnvironment(env, config)
    +
    +    val ds1 = CollectionDataSets.getSmall3TupleDataSet(env).toTable(tEnv, 'a, 'b, 'c)
    +    val ds2 = CollectionDataSets.getOneElement3TupleDataSet(env).toTable(tEnv, 'a, 'b, 'c)
    +
    +    val minusDs = ds1.minusAll(ds2).select('c)
    +
    +    val results = minusDs.toDataSet[Row].collect()
    +    val expected = "Hello\n" + "Hello world\n"
    +    TestBaseUtils.compareResultAsText(results.asJava, expected)
    +  }
    +
    +  @Test
    +  def testSetMinusAllWithDuplicates(): Unit = {
    +    val env: ExecutionEnvironment = ExecutionEnvironment.getExecutionEnvironment
    +    val tEnv = TableEnvironment.getTableEnvironment(env, config)
    +
    +    val ds1 = CollectionDataSets.getSmall3TupleDataSet(env).toTable(tEnv, 'a, 'b, 'c)
    +    val ds2 = CollectionDataSets.getSmall3TupleDataSet(env).toTable(tEnv, 'a, 'b, 'c)
    +    val ds3 = CollectionDataSets.getOneElement3TupleDataSet(env).toTable(tEnv, 'a, 'b, 'c)
    +
    +    val minusDs = ds1.unionAll(ds2).minusAll(ds3).select('c)
    +
    +    val results = minusDs.toDataSet[Row].collect()
    +    val expected = "Hello\n" + "Hello world\n" +
    +      "Hello\n" + "Hello world\n"
    +    TestBaseUtils.compareResultAsText(results.asJava, expected)
    +  }
    +
    +  @Test
    +  def testSetMinus(): Unit = {
    --- End diff --
    
    Can you combine this and the next test by using test data that covers both cases for different records, i.e., have some records with duplicates in the first, the second, neither, and both data sets?
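
A minimal sketch of what such combined test data could look like, using plain Scala collections instead of the Flink test utilities. The rows, the object name `CombinedMinusData`, and the string labels are hypothetical and are not taken from `CollectionDataSets`; the expected result assumes standard SQL EXCEPT (set) semantics.

```scala
object CombinedMinusData {
  // Hypothetical (a, b, c) rows, chosen so that every duplicate case occurs once.
  type Rec = (Int, Long, String)

  val first: Seq[Rec] = Seq(
    (1, 1L, "dupFirst"), (1, 1L, "dupFirst"), // duplicated in the first input only
    (2, 2L, "dupSecond"),                     // once here, duplicated in the second input
    (3, 3L, "unique"),                        // no duplicates, first input only
    (4, 4L, "dupBoth"), (4, 4L, "dupBoth")    // duplicated in both inputs
  )

  val second: Seq[Rec] = Seq(
    (2, 2L, "dupSecond"), (2, 2L, "dupSecond"),
    (4, 4L, "dupBoth"), (4, 4L, "dupBoth")
  )

  // Assumed EXCEPT semantics: distinct rows of `first` that do not occur in `second`.
  val expectedMinus: Seq[Rec] = first.distinct.filterNot(second.toSet)
}
```

With this data a single test would exercise all four cases: only the `dupFirst` and `unique` rows survive, each exactly once.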


> Add support for EXCEPT (set minus)
> ----------------------------------
>
>                 Key: FLINK-3943
>                 URL: https://issues.apache.org/jira/browse/FLINK-3943
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table API & SQL
>    Affects Versions: 1.1.0
>            Reporter: Fabian Hueske
>            Assignee: Ivan Mushketyk
>            Priority: Minor
>
> Currently, the Table API and SQL do not support EXCEPT.
> EXCEPT can be executed as a coGroup on all fields that forwards records of the first input if the second input is empty.
> In order to add support for EXCEPT to the Table API and SQL we need to:
> - Implement a {{DataSetMinus}} class that translates an EXCEPT into a DataSet API program using a coGroup on all fields.
> - Implement a {{DataSetMinusRule}} that translates a Calcite {{LogicalMinus}} into a {{DataSetMinus}}.
> - Extend the Table API (and validation phase) to provide an except() method.
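
The coGroup idea in the issue description can be illustrated without Flink. The sketch below models a coGroup on all fields with plain Scala collections; `ExceptAsCoGroup`, `coGroupAllFields`, and `except` are hypothetical names for illustration and are not the `DataSetMinus` implementation from the pull request. It assumes set semantics, so one record is forwarded per group.

```scala
object ExceptAsCoGroup {
  // Local stand-in for a coGroup keyed on all fields: each distinct record
  // yields one pair (records from the first input, records from the second input).
  def coGroupAllFields[T](first: Seq[T], second: Seq[T]): Seq[(Seq[T], Seq[T])] = {
    val firstGroups = first.groupBy(identity)
    val secondGroups = second.groupBy(identity)
    (first ++ second).distinct.map { key =>
      (firstGroups.getOrElse(key, Seq.empty), secondGroups.getOrElse(key, Seq.empty))
    }
  }

  // EXCEPT: forward a record of the first input only if the second input's
  // group for that record is empty. One record per group is emitted
  // (assumed set semantics).
  def except[T](first: Seq[T], second: Seq[T]): Seq[T] =
    coGroupAllFields(first, second).collect {
      case (l, r) if l.nonEmpty && r.isEmpty => l.head
    }
}
```

For example, `ExceptAsCoGroup.except(Seq(1, 1, 2, 3), Seq(2, 2, 4))` keeps `1` and `3`: the group for `2` has a non-empty second side, and the group for `4` has nothing to forward from the first side.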



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
