flink-issues mailing list archives

From "Fabian Hueske (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-2946) Add orderBy() to Table API
Date Fri, 01 Apr 2016 22:15:25 GMT

    [ https://issues.apache.org/jira/browse/FLINK-2946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15222431#comment-15222431 ]

Fabian Hueske commented on FLINK-2946:

Hi Dawid,

At the moment, the Table API does not allow configuring the parallelism of the execution
plan. All Table API operators (incl. sorting) should be executed with the default parallelism
as defined in the {{ExecutionEnvironment}}. Hence, you are right: we should not look at the
parallelism of the preceding task but at the parallelism of the environment.

You can get the {{ExecutionEnvironment}} from the input {{DataSet}} by calling {{getExecutionEnvironment()}}.
The {{ExecutionEnvironment}} has a method {{getParallelism()}}. If the parallelism was
not explicitly defined ({{getParallelism()}} returns {{-1}}), we should add the range partitioner.
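
A minimal sketch of that check, written as plain Java without a Flink dependency so it can be run on its own. The class name {{OrderByParallelismSketch}}, the constant {{PARALLELISM_UNSET}}, and the method {{needsRangePartitioner}} are hypothetical names for illustration. The rule that a parallelism of 1 needs no range partitioner is an assumption read off the suggestion above, not confirmed behaviour of the Table API.

```java
/** Sketch only: models the decision described above; not part of Flink. */
final class OrderByParallelismSketch {

    // Per the comment above, getParallelism() returns -1 when the
    // parallelism was not explicitly defined on the environment.
    static final int PARALLELISM_UNSET = -1;

    /**
     * Decides whether a range partitioner should precede the sort.
     *
     * @param envParallelism in a real job this would come from
     *        input.getExecutionEnvironment().getParallelism()
     */
    static boolean needsRangePartitioner(int envParallelism) {
        if (envParallelism == PARALLELISM_UNSET) {
            // Not explicitly defined: the job may run with more than one
            // parallel instance, so add the range partitioner.
            return true;
        }
        // Assumption: a single parallel instance already sees all records,
        // so sorting the one partition yields a total order.
        return envParallelism > 1;
    }
}
```

With this rule, only an environment explicitly set to a parallelism of 1 skips the range partitioner; the unset case is treated like a parallel one.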

> Add orderBy() to Table API
> --------------------------
>                 Key: FLINK-2946
>                 URL: https://issues.apache.org/jira/browse/FLINK-2946
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table API
>            Reporter: Timo Walther
>            Assignee: Dawid Wysakowicz
> In order to implement a FLINK-2099 prototype that uses the Table API's code generation
> facilities, the Table API needs a sorting feature.
> I would implement it in the next few days. Ideas on how to implement such a sorting feature are
> very welcome. Is there a more efficient way than {{.sortPartition(...).setParallelism(1)}}?
> Is it better to sort locally on the nodes first and finally sort on one node afterwards?

This message was sent by Atlassian JIRA