flink-issues mailing list archives

From "Fabian Hueske (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-3738) Refactor TableEnvironment and TranslationContext
Date Wed, 13 Apr 2016 09:40:25 GMT

    [ https://issues.apache.org/jira/browse/FLINK-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238964#comment-15238964 ]

Fabian Hueske commented on FLINK-3738:

Yes, but these methods will require a {{TableEnvironment}} parameter.
So it will be necessary to explicitly convert a DataSet/DataStream to a Table
({{mySet.toTable(tEnv).where('a === 1)}}), and it won't be possible to directly apply
Table operations on a DataSet/DataStream ({{mySet.where('a === 1)}}).
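
To illustrate the difference between the two call styles, here is a minimal, self-contained sketch. It does not use Flink: {{TableEnvironment}}, {{Table}} and {{DataSet}} below are hypothetical stand-ins, and the predicate is passed as a plain field name and value instead of the {{'a === 1}} expression syntax.

```scala
// Toy stand-ins, NOT the Flink classes: every name here is hypothetical.
object ExplicitConversionSketch {

  final class TableEnvironment(val name: String)

  // A Table remembers the environment it was created in.
  final case class Table(tEnv: TableEnvironment, rows: Seq[Map[String, Int]]) {
    // Keep only rows whose `field` equals `value` (stand-in for where('a === 1)).
    def where(field: String, value: Int): Table =
      copy(rows = rows.filter(_.get(field).contains(value)))
  }

  final case class DataSet(rows: Seq[Map[String, Int]]) {
    // The conversion needs an environment argument, so it is called explicitly.
    def toTable(tEnv: TableEnvironment): Table = Table(tEnv, rows)
  }

  def main(args: Array[String]): Unit = {
    val tEnv = new TableEnvironment("batch")
    val mySet = DataSet(Seq(Map("a" -> 1), Map("a" -> 2)))

    // Explicit conversion first, Table operations afterwards.
    val result = mySet.toTable(tEnv).where("a", 1)
    println(result.rows)

    // mySet.where("a", 1) would not compile here: DataSet has no `where`,
    // and nothing converts it to a Table without being handed an environment.
  }
}
```

In this sketch the only way to reach {{where}} is through {{toTable(tEnv)}}, which mirrors the explicit conversion described above.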

> Refactor TableEnvironment and TranslationContext
> ------------------------------------------------
>                 Key: FLINK-3738
>                 URL: https://issues.apache.org/jira/browse/FLINK-3738
>             Project: Flink
>          Issue Type: Task
>          Components: Table API
>            Reporter: Fabian Hueske
>            Assignee: Fabian Hueske
> Currently the Table API uses a static object called {{TranslationContext}} which holds
the Calcite table catalog and a Calcite planner instance. Whenever a {{DataSet}} or {{DataStream}}
is converted into a {{Table}} or registered as a {{Table}} on the {{TableEnvironment}}, a
new entry is added to the catalog. The first time a {{Table}} is added, a planner instance
is created. The planner is used to optimize the query (defined by one or more Table API operations
and/or one or more SQL queries) when a {{Table}} is converted into a {{DataSet}} or {{DataStream}}.
Since a planner may only be used to optimize a single program, the choice of a single static
object is problematic.
> I propose to refactor the {{TableEnvironment}} to take over the responsibility of holding
the catalog and the planner instance. 
> - A {{TableEnvironment}} holds a catalog of registered tables and a single planner instance.
> - A {{TableEnvironment}} will only allow translating a single {{Table}} (possibly composed
of several Table API operations and SQL queries) into a {{DataSet}} or {{DataStream}}. 
> - A {{TableEnvironment}} is bound to an {{ExecutionEnvironment}} or a {{StreamExecutionEnvironment}}.
This is necessary to create data sources or source functions to read external tables or streams.
> - {{DataSet}} and {{DataStream}} need a reference to a {{TableEnvironment}} to be converted
into a {{Table}}. This will prohibit implicit casts as currently supported for the DataSet
Scala API.
> - A {{Table}} needs a reference to the {{TableEnvironment}} it is bound to. Only tables
from the same {{TableEnvironment}} can be processed together.
> - The {{TranslationContext}} will be completely removed.
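
The ownership rules proposed in the list above can be sketched in a few lines of plain Scala. This is only a toy model under stated assumptions: the classes are hypothetical and share nothing with Flink's implementation beyond their names, the "planner" is reduced to a flag that allows one translation, and a query plan is just a list of operation names.

```scala
import scala.collection.mutable
import scala.util.Try

// Hypothetical model of the proposal; none of this is Flink's real code.
object TableEnvironmentSketch {

  // A Table carries a reference to the environment it is bound to.
  final class Table(val tEnv: TableEnvironment, val ops: List[String]) {
    def where(predicate: String): Table =
      new Table(tEnv, ops :+ s"where($predicate)")

    // Only tables from the same environment can be processed together.
    def unionAll(other: Table): Table = {
      require(other.tEnv eq tEnv, "tables are bound to different TableEnvironments")
      new Table(tEnv, (ops ++ other.ops) :+ "unionAll")
    }
  }

  // The environment owns the catalog and the single-use planner state,
  // replacing a shared static object.
  final class TableEnvironment {
    private val catalog = mutable.Map.empty[String, Table]
    private var plannerUsed = false

    def registerTable(name: String): Table = {
      val table = new Table(this, List(s"scan($name)"))
      catalog(name) = table
      table
    }

    def isRegistered(name: String): Boolean = catalog.contains(name)

    // Stand-in for converting a Table into a DataSet or DataStream:
    // allowed once per environment, and only for tables bound to it.
    def translate(table: Table): String = {
      require(table.tEnv eq this, "table is bound to a different TableEnvironment")
      require(!plannerUsed, "this TableEnvironment has already translated a Table")
      plannerUsed = true
      table.ops.mkString(" -> ")
    }
  }

  def main(args: Array[String]): Unit = {
    val envA = new TableEnvironment
    val envB = new TableEnvironment
    val t1 = envA.registerTable("t1").where("a === 1")
    val t2 = envB.registerTable("t2")

    println(Try(t1.unionAll(t2)).isFailure)    // mixing environments is rejected
    println(envA.translate(t1))                // first translation succeeds
    println(Try(envA.translate(t1)).isFailure) // a second translation is rejected
  }
}
```

Because every piece of state lives in a {{TableEnvironment}} instance, two environments in the same JVM do not interfere with each other, which is the problem the static {{TranslationContext}} has.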

