flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-6442) Extend TableAPI Support Sink Table Registration and ‘insert into’ Clause in SQL
Date Fri, 21 Jul 2017 09:56:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16096073#comment-16096073 ]

ASF GitHub Bot commented on FLINK-6442:

Github user fhueske commented on the issue:

    Hi @lincoln-lil, that's very good input!
    What do you think about the following? We keep the current `TableSink` interface, but
when registering a `TableSink` in a `TableEnvironment` we request field types (and optionally
field names). Internally, the `TableEnvironment` calls `configure()` and stores the returned
configured copy of the `TableSink` in the catalog. This would have the benefits that
    - we use the existing interface in a clean way and only need to update the documentation
to explain both modes to use the interface.
    - the same `TableSink` implementation can be used with eager and lazy schema registration.
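    To make this concrete, here is a minimal, self-contained sketch of the configure-and-copy pattern described above. Everything beyond the `configure()` idea itself (the `CsvSink` and `TableEnvironment` stand-ins, the string-based field types) is a hypothetical simplification, not the real Flink API:

```scala
// Sketch of the proposed registration flow: the environment calls
// configure() on the sink and stores the *configured copy* in the
// catalog, leaving the original sink instance untouched.
// All names here are simplified stand-ins, not actual Flink classes.
object SinkRegistrationSketch {

  trait TableSink {
    // Returns a configured copy of this sink; the original stays
    // unconfigured, so one instance can serve several registrations.
    def configure(fieldNames: Array[String], fieldTypes: Array[String]): TableSink
    def fieldNames: Array[String]
  }

  class CsvSink(val fieldNames: Array[String] = Array.empty) extends TableSink {
    def configure(names: Array[String], types: Array[String]): TableSink =
      new CsvSink(names) // copy; the receiver is not mutated
  }

  class TableEnvironment {
    private val catalog = scala.collection.mutable.Map[String, TableSink]()

    // Eager registration: field names/types are supplied up front and
    // the environment stores the configured copy in its catalog.
    def registerTableSink(name: String,
                          fieldNames: Array[String],
                          fieldTypes: Array[String],
                          sink: TableSink): Unit =
      catalog(name) = sink.configure(fieldNames, fieldTypes)

    def sink(name: String): TableSink = catalog(name)
  }

  def main(args: Array[String]): Unit = {
    val tEnv = new TableEnvironment
    val raw  = new CsvSink
    tEnv.registerTableSink("targetTable", Array("a", "b"), Array("INT", "STRING"), raw)
    println(tEnv.sink("targetTable").fieldNames.mkString(",")) // a,b
    println(raw.fieldNames.length)                             // 0 -- original unchanged
  }
}
```

    The point of returning a copy from `configure()` is exactly the second bullet above: the same unconfigured `TableSink` instance works for both eager registration (schema given at registration time) and lazy registration (schema inferred later from the query).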
    I agree with your proposal for `writeToSink` and `insertInto`. So the method signature
would be `Table.insertInto(tableSink: String, config: QueryConfig): Unit`, where `tableSink`
would be the name of a registered `TableSink`.
    Regarding the names of the methods, I'm not sure how well-known the distinction between `SQL`,
`DML`, and `DDL` is. You are of course right that `SELECT` and `INSERT` are part of `DML` (but
also part of SQL, which is the superset of `DML` and `DDL`).
    I think SQL is just better known than `DML`, and many users might not know what `DML` stands for.
    I'd propose the following two methods:
    - `sqlInsert(query: String, config: QueryConfig): Unit` and 
    - `sqlSelect(query: String): Table` (we can add `sqlSelect` and deprecate `sql`).
    Does that make sense to you?
    Best, Fabian
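As a rough illustration of why separate `sqlInsert`/`sqlSelect` entry points are attractive, a single `sql()` method would have to dispatch between the two statement kinds itself. Here is a self-contained sketch of that dispatch; the string matching is a naive placeholder for illustration only, not Calcite's actual parser:

```scala
// Naive classification of a statement as an insert
// ((INSERT | UPSERT) INTO tablePrimary query) or a plain query,
// mirroring the Calcite grammar quoted in the issue description.
// A real implementation would use Calcite's parser, not a regex.
object SqlDispatchSketch {
  sealed trait Statement
  case class Insert(targetTable: String, query: String) extends Statement
  case class Select(query: String) extends Statement

  // (?i) case-insensitive, (?s) dot matches newlines
  private val InsertPattern =
    """(?is)\s*(?:INSERT|UPSERT)\s+INTO\s+(\w+)\s+(.*)""".r

  def classify(sql: String): Statement = sql match {
    case InsertPattern(table, rest) => Insert(table, rest.trim)
    case other                      => Select(other.trim)
  }
}
```

For example, `classify("INSERT INTO targetTable SELECT a, b, c FROM sourceTable")` yields `Insert("targetTable", "SELECT a, b, c FROM sourceTable")`, while a plain `SELECT` falls through to `Select`.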

> Extend TableAPI Support Sink Table Registration and ‘insert into’ Clause in SQL
> -------------------------------------------------------------------------------
>                 Key: FLINK-6442
>                 URL: https://issues.apache.org/jira/browse/FLINK-6442
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table API & SQL
>            Reporter: lincoln.lee
>            Assignee: lincoln.lee
>            Priority: Minor
> Currently the TableAPI only provides a registration method for source tables. When we
use SQL to write a streaming job, we have to add an extra step for the sink, e.g. via the TableAPI:
> {code}
> val sqlQuery = "SELECT * FROM MyTable WHERE _1 = 3"
> val t = StreamTestData.getSmall3TupleDataStream(env)
> tEnv.registerDataStream("MyTable", t)
> // one way: invoke tableAPI’s writeToSink method directly
> val result = tEnv.sql(sqlQuery)
> result.writeToSink(new YourStreamSink)
> // another way: convert to datastream first and then invoke addSink 
> val result = tEnv.sql(sqlQuery).toDataStream[Row]
> result.addSink(new StreamITCase.StringSink)
> {code}
> From the API we can see that the sink table is always a derived table, because its schema
is inferred from the result type of the upstream query.
> Compared to a traditional RDBMS, which supports DML syntax, a query with a target output
could be written like this:
> {code}
> insert into table target_table_name
> [(column_name [ ,...n ])]
> query
> {code}
> The equivalent form of the example above is as follows:
> {code}
>     tEnv.registerTableSink("targetTable", new YourSink)
>     val sql = "INSERT INTO targetTable SELECT a, b, c FROM sourceTable"
>     val result = tEnv.sql(sql)
> {code}
> It is supported by Calcite’s grammar: 
> {code}
>  insert:( INSERT | UPSERT ) INTO tablePrimary
>  [ '(' column [, column ]* ')' ]
>  query
> {code}
> I'd like to extend the Flink TableAPI to support this feature. See the design doc: https://goo.gl/n3phK5

This message was sent by Atlassian JIRA
