flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-9947) Document unified table sources/sinks/formats
Date Thu, 02 Aug 2018 10:05:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-9947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16566575#comment-16566575 ]

ASF GitHub Bot commented on FLINK-9947:
---------------------------------------

fhueske commented on a change in pull request #6456: [FLINK-9947] [docs] Document unified
table sources/sinks/formats
URL: https://github.com/apache/flink/pull/6456#discussion_r207141849
 
 

 ##########
 File path: docs/dev/table/connect.md
 ##########
 @@ -67,14 +67,24 @@ This table is only available for stable releases.
 Overview
 --------
 
-Beginning from Flink 1.6, the declaration of a connection to an external system is separated
from the actual implementation. Connections can be specified either
+Beginning from Flink 1.6, the declaration of a connection to an external system is separated
from the actual implementation.
+
+Connections can be specified either
 
 - **programmatically** using a `Descriptor` under `org.apache.flink.table.descriptors` for
Table & SQL API
 - or **declaratively** via [YAML configuration files](http://yaml.org/) for the SQL Client.
 
-This allows not only for better unification of APIs and SQL Client but also for better extensibility
in case of [custom implementations](sourceSinks.html) without changing the declaration.
+This allows not only for better unification of APIs and SQL Client but also for better extensibility
in case of [custom implementations](sourceSinks.html) without changing the actual declaration.
+
+Every declaration is similar to a SQL `CREATE TABLE` statement. One can define the name of
the table, the final schema of the table, a connector, and a data format upfront for connecting
to an external system.
+
+The **connector** describes the external system that should be used as a source and/or target
of data. Storage systems such as [Apache Kafka](http://kafka.apache.org/) or a regular file
system can be declared here. The connector might already provide a fixed format with fields
and schema.
+
+Some systems support different **data formats**. For example, one can encode the rows of
a table in CSV, JSON, or Avro representation before writing them into a file. A database connector
might need the table schema here. Whether or not a storage system requires the definition
of a format is documented for every [connector](connect.html#table-connectors). Different
systems also require different [types of formats](connect.html#table-formats) (e.g., column-oriented
formats vs. row-oriented formats). The documentation states which format types and connectors
are compatible.
 
-Similar to a SQL `CREATE TABLE` statement, one can define the name of the table, the final
schema of the table, connector, and a data format upfront for connecting to an external system.
Additionally, the table's type (source, sink, or both) and an update mode for streaming queries
can be specified:
+The **table schema** defines the schema of a table that is exposed to SQL queries. It forms
the interface between the "external world" and the "table world". The schema has access to
fields defined by the connector or format. The schema can use one or more fields for extracting
or ingesting [time attributes](streaming.html#time-attributes). If input fields have no deterministic
field order, the schema clearly defines field names, their order, and origin.
 
 Review comment:
   Not sure if I would talk about "external world" and "table world". This might raise some
questions.
   I'd rather use the terms that we introduced before and say that it describes how a source
maps the data format to the table schema and a sink vice versa.
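
To make the suggested wording concrete, here is a minimal, self-contained sketch of a schema acting as the mapping between format fields and table columns, in both directions. It is not Flink's descriptor API: `SchemaField`, `TableSchema`, `from_format`, `to_format`, and the sample field names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class SchemaField:
    """One table column and the format field it originates from (hypothetical)."""
    name: str    # column name exposed to SQL queries
    origin: str  # field name in the record produced by the connector/format


@dataclass(frozen=True)
class TableSchema:
    """Ordered list of columns; the order here defines the table's field order."""
    fields: List[SchemaField]

    def from_format(self, record: Dict[str, Any]) -> List[Any]:
        # Source direction: pick format fields by origin, emit them in schema order.
        return [record[f.origin] for f in self.fields]

    def to_format(self, row: List[Any]) -> Dict[str, Any]:
        # Sink direction: map each table column back to its format field.
        if len(row) != len(self.fields):
            raise ValueError("row arity does not match schema")
        return {f.origin: value for f, value in zip(self.fields, row)}


schema = TableSchema([
    SchemaField(name="user_id", origin="uid"),
    SchemaField(name="message", origin="msg"),
])

# A JSON-like record has no deterministic field order; the schema fixes one.
row = schema.from_format({"msg": "hello", "uid": 42})
record = schema.to_format(row)
```

In this sketch the source side reads `uid` and `msg` regardless of their order in the record and yields a row ordered as `user_id`, `message`; the sink side performs the inverse mapping.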

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> Document unified table sources/sinks/formats
> --------------------------------------------
>
>                 Key: FLINK-9947
>                 URL: https://issues.apache.org/jira/browse/FLINK-9947
>             Project: Flink
>          Issue Type: Improvement
>          Components: Documentation, Table API & SQL
>            Reporter: Timo Walther
>            Assignee: Timo Walther
>            Priority: Major
>              Labels: pull-request-available
>
> The recent unification of table sources/sinks/formats needs documentation. I propose
a new page that explains the built-in sources, sinks, and formats as well as a page for customization
of public interfaces.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
