flink-issues mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [flink] bowenli86 commented on issue #8007: [FLINK-11474][table] Add ReadableCatalog, ReadableWritableCatalog, and other …
Date Thu, 11 Apr 2019 07:45:50 GMT
bowenli86 commented on issue #8007: [FLINK-11474][table] Add ReadableCatalog, ReadableWritableCatalog,
and other …
URL: https://github.com/apache/flink/pull/8007#issuecomment-482005765
 
 
   Hi all,
   
   I believe @sunjincheng121 and @hequn8128 brought up valuable suggestions for avoiding confusion
over API names. Xuefu also made very good points about API design and implementation, and
that javadoc should be the source of truth for understanding the APIs.
   
   I may previously have been overly influenced by Hive's design, given that I've been working
heavily on Flink-Hive integration. @sunjincheng121's concerns, if I understand correctly, come
from the fact that these APIs will be used not only by SQL users but also by Table API users,
who may not have a Hive background and are thus more easily confused. So I tried to step out
of the Hive context and inspect these APIs from the perspective of their usage, referencing
MySQL, Postgres, Oracle, SQL Server, and Hive. Here are my thoughts:
   
   On the reading side, a view is always treated as a logical table. In queries (SELECT in
standard SQL DML), a view is a table: the 'FROM' clause is always "FROM x" rather than "FROM
`TABLE/VIEW` x". It's the planner's responsibility to process views specially. The same holds
for meta commands when no extra params are given: "DESCRIBE" doesn't distinguish tables from
views. Listing tables usually comes in two syntaxes, "SHOW TABLES" and "SELECT * FROM meta",
and both return tables as well as views; listing only views requires a different command or
extra params, like "SHOW VIEWS" and "SELECT * FROM meta WHERE type='view'".
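   
   To make the read-side behavior concrete, here is a minimal sketch. `ReadSideSketch`, `Kind`,
and `register()` are hypothetical names made up for illustration and are not Flink classes;
only `getTable()`, `listTables()`, and `listViews()` echo the method names discussed in this
thread, and their signatures here are assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of the read side; not Flink's actual catalog code.
class ReadSideSketch {
    enum Kind { TABLE, VIEW }

    // Object name -> kind, kept in insertion order.
    private final Map<String, Kind> objects = new LinkedHashMap<>();

    // Helper for the sketch only: puts an object into the model.
    void register(String name, Kind kind) {
        objects.put(name, kind);
    }

    // Analogous to "SHOW TABLES": returns tables and views alike.
    List<String> listTables() {
        return new ArrayList<>(objects.keySet());
    }

    // Analogous to "SHOW VIEWS": returns views only.
    List<String> listViews() {
        List<String> views = new ArrayList<>();
        for (Map.Entry<String, Kind> entry : objects.entrySet()) {
            if (entry.getValue() == Kind.VIEW) {
                views.add(entry.getKey());
            }
        }
        return views;
    }

    // Analogous to "DESCRIBE x" or "FROM x": the lookup is by name only,
    // and the caller does not say whether x is a table or a view.
    Kind getTable(String name) {
        Kind kind = objects.get(name);
        if (kind == null) {
            throw new IllegalArgumentException("No table or view named " + name);
        }
        return kind;
    }
}
```

   With a table `orders` and a view `big_orders` registered, `listTables()` returns both names
while `listViews()` returns only `big_orders`.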
   
   On the writing side, a view is treated differently from a table, given that the representations
of views and tables are a bit different (though they share some common fields). DDL, especially
CREATE and ALTER, always requires specifying either `TABLE` or `VIEW`, as in "CREATE/ALTER
`TABLE/VIEW` x". "DROP/RENAME" don't touch fields inside the table or view, so their implementations
behind the scenes are usually the same, and therefore some databases choose not to require
the `TABLE/VIEW` keyword, but I think it really depends on the developers. Since our devs feel
strongly that omitting it causes confusion, we can require the keywords in our APIs and Flink SQL.
   
   I think we should avoid designs in which a SQL statement is translated into multiple catalog
API calls or requires unnecessary extra processing. With that in mind, and given the above
conclusions (please correct me if anything above is wrong), I propose the following solution:
`ReadableCatalog` APIs should treat views as tables by default when no extra params are
specified, so `getTable()` and `listTables()` operate on both tables and views, and we will
have individual APIs such as `listPhysicalTables()`, `listViews()`, and potentially
`listMaterializedViews()` in the future. `ReadableWritableCatalog` APIs should treat views and
tables differently, and thus have separate create/alter/drop/rename APIs for views and tables,
e.g. `dropTable()` and `dropView()`, even though the two will very likely share the same code.
We will also add clear javadoc and Flink documentation for all catalog APIs in a separate PR.
This way, we can eliminate the confusion and still maintain a one-to-one mapping between SQL
statements and catalog APIs.
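   
   A rough sketch of what this proposal could look like in Java follows. The interface and
method names come from the proposal above, but every signature is an assumption: the string
names, `CatalogObject`, `PhysicalTable`, `View`, and `InMemoryCatalog` are simplified stand-ins
invented for illustration, and the alter/rename methods are left out for brevity. The actual
code in the PR may differ.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for catalog objects; not the types used in the PR.
interface CatalogObject { }

final class PhysicalTable implements CatalogObject {
    final List<String> columns;
    PhysicalTable(List<String> columns) { this.columns = columns; }
}

final class View implements CatalogObject {
    final String query;
    View(String query) { this.query = query; }
}

// Read side: views are treated as tables unless a specific API is used.
interface ReadableCatalog {
    CatalogObject getTable(String name);  // resolves a table or a view
    List<String> listTables();            // tables and views
    List<String> listPhysicalTables();    // tables only
    List<String> listViews();             // views only
}

// Write side: separate methods per object kind, one per SQL statement.
interface ReadableWritableCatalog extends ReadableCatalog {
    void createTable(String name, PhysicalTable table);  // CREATE TABLE
    void createView(String name, View view);             // CREATE VIEW
    void dropTable(String name);                         // DROP TABLE
    void dropView(String name);                          // DROP VIEW
}

// Hypothetical in-memory implementation, for illustration only.
class InMemoryCatalog implements ReadableWritableCatalog {
    private final Map<String, CatalogObject> objects = new LinkedHashMap<>();

    public CatalogObject getTable(String name) {
        CatalogObject object = objects.get(name);
        if (object == null) {
            throw new IllegalArgumentException("No table or view named " + name);
        }
        return object;
    }

    public List<String> listTables() { return new ArrayList<>(objects.keySet()); }
    public List<String> listPhysicalTables() { return namesOf(PhysicalTable.class); }
    public List<String> listViews() { return namesOf(View.class); }

    public void createTable(String name, PhysicalTable table) { create(name, table); }
    public void createView(String name, View view) { create(name, view); }

    // The two drop methods share one code path but check the object kind.
    public void dropTable(String name) { drop(name, PhysicalTable.class); }
    public void dropView(String name) { drop(name, View.class); }

    private List<String> namesOf(Class<? extends CatalogObject> type) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, CatalogObject> entry : objects.entrySet()) {
            if (type.isInstance(entry.getValue())) {
                names.add(entry.getKey());
            }
        }
        return names;
    }

    private void create(String name, CatalogObject object) {
        if (objects.putIfAbsent(name, object) != null) {
            throw new IllegalStateException("Already exists: " + name);
        }
    }

    private void drop(String name, Class<? extends CatalogObject> type) {
        // isInstance(null) is false, so a missing name is rejected here too.
        if (!type.isInstance(objects.get(name))) {
            throw new IllegalArgumentException("No " + type.getSimpleName() + " named " + name);
        }
        objects.remove(name);
    }
}
```

   In this sketch, calling `dropTable()` on a view is rejected, which mirrors requiring the
`TABLE/VIEW` keyword in the corresponding SQL statement, while `getTable()` resolves either kind.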
   
   What do you think?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services
