flink-issues mailing list archives

From "Aljoscha Krettek (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-3738) Refactor TableEnvironment and TranslationContext
Date Wed, 13 Apr 2016 08:43:25 GMT

    [ https://issues.apache.org/jira/browse/FLINK-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238864#comment-15238864 ]

Aljoscha Krettek commented on FLINK-3738:

What do you mean by this? "DataSet and DataStream need a reference to a TableEnvironment to
be converted into a Table. This will prohibit implicit casts as currently supported for the
DataSet Scala API."

Will we add methods to convert to Table directly on DataSet/DataStream?

> Refactor TableEnvironment and TranslationContext
> ------------------------------------------------
>                 Key: FLINK-3738
>                 URL: https://issues.apache.org/jira/browse/FLINK-3738
>             Project: Flink
>          Issue Type: Task
>          Components: Table API
>            Reporter: Fabian Hueske
>            Assignee: Fabian Hueske
> Currently the Table API uses a static object called {{TranslationContext}} which holds
the Calcite table catalog and a Calcite planner instance. Whenever a {{DataSet}} or {{DataStream}}
is converted into a {{Table}} or registered as a {{Table}} on the {{TableEnvironment}}, a
new entry is added to the catalog. The first time a {{Table}} is added, a planner instance
is created. The planner is used to optimize the query (defined by one or more Table API operations
and/or one or more SQL queries) when a {{Table}} is converted into a {{DataSet}} or {{DataStream}}.
Since a planner may only be used to optimize a single program, the choice of a single static
object is problematic.
> I propose to refactor the {{TableEnvironment}} to take over the responsibility of holding
the catalog and the planner instance. 
> - A {{TableEnvironment}} holds a catalog of registered tables and a single planner instance.
> - A {{TableEnvironment}} will only allow translating a single {{Table}} (possibly composed
of several Table API operations and SQL queries) into a {{DataSet}} or {{DataStream}}.
> - A {{TableEnvironment}} is bound to an {{ExecutionEnvironment}} or a {{StreamExecutionEnvironment}}.
This is necessary to create data sources or source functions to read external tables or streams.
> - {{DataSet}} and {{DataStream}} need a reference to a {{TableEnvironment}} to be converted
into a {{Table}}. This will prohibit implicit casts as currently supported for the DataSet
Scala API.
> - A {{Table}} needs a reference to the {{TableEnvironment}} it is bound to. Only tables
from the same {{TableEnvironment}} can be processed together.
> - The {{TranslationContext}} will be completely removed.
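
The ownership rules proposed in the list above can be illustrated with a minimal sketch. It does not use Flink's real classes: {{SketchTableEnvironment}}, {{SketchTable}}, and all their method names are hypothetical stand-ins, and a plain {{List}} stands in for a {{DataSet}}. It only shows an environment-owned catalog, explicit conversion through an environment, a single translation per environment, and rejection of tables from another environment.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the proposed TableEnvironment; not Flink's real class.
final class SketchTableEnvironment {
    // The catalog lives in the environment instead of a static TranslationContext.
    private final Map<String, SketchTable> catalog = new HashMap<>();
    private boolean translated = false;

    // Conversion goes through an explicit environment, so no implicit cast is possible.
    SketchTable fromDataSet(List<?> dataSet) {
        return new SketchTable(this, new ArrayList<Object>(dataSet));
    }

    void registerTable(String name, SketchTable table) {
        requireOwned(table);
        catalog.put(name, table);
    }

    SketchTable scan(String name) {
        SketchTable table = catalog.get(name);
        if (table == null) {
            throw new IllegalArgumentException("No table registered as " + name);
        }
        return table;
    }

    // Assumption of this sketch: a second translation is rejected with an exception.
    List<Object> toDataSet(SketchTable table) {
        requireOwned(table);
        if (translated) {
            throw new IllegalStateException("This environment already translated a table");
        }
        translated = true;
        return new ArrayList<>(table.rows());
    }

    private void requireOwned(SketchTable table) {
        if (table.environment() != this) {
            throw new IllegalArgumentException("Table belongs to a different environment");
        }
    }
}

// Hypothetical stand-in for Table; it keeps a reference to the environment it is bound to.
final class SketchTable {
    private final SketchTableEnvironment environment;
    private final List<Object> rows;

    SketchTable(SketchTableEnvironment environment, List<Object> rows) {
        this.environment = environment;
        this.rows = rows;
    }

    SketchTableEnvironment environment() {
        return environment;
    }

    List<Object> rows() {
        return rows;
    }

    // Only tables from the same environment can be processed together.
    SketchTable unionAll(SketchTable other) {
        if (other.environment != environment) {
            throw new IllegalArgumentException("Tables are bound to different environments");
        }
        List<Object> combined = new ArrayList<>(rows);
        combined.addAll(other.rows);
        return new SketchTable(environment, combined);
    }
}
```

In this sketch, combining two tables created by the same environment succeeds, while combining tables from two different environments throws. How the real API would report such misuse is not stated in the issue.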

This message was sent by Atlassian JIRA