flink-issues mailing list archives

From "Fabian Hueske (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-2946) Add orderBy() to Table API
Date Thu, 24 Mar 2016 11:47:25 GMT

    [ https://issues.apache.org/jira/browse/FLINK-2946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15210118#comment-15210118 ]

Fabian Hueske commented on FLINK-2946:

Hi [~dawidwys], thanks a lot for working on this issue!
I had a look at your branch. You're definitely on the right track. Here are a few comments:

- The Table API syntax looks good
- In {{Table.orderBy()}} you should not extract aggregations, etc. Instead, check that each
expression matches one of the following patterns ({{Table.as()}} does similar checks):
-- {{UnresolvedFieldReference}}
-- {{Asc(UnresolvedFieldReference)}}
-- {{Desc(UnresolvedFieldReference)}}
-- We can add support for more complex expressions and order by position later.
- Add asc() to {{RexNodeTranslator}}
- I just realized that Flink's range partitioning lacks support to define sort orders for
partition keys. We need to add this to make global sorting work correctly. I added FLINK-3665
to address this issue.
- We do not need to range partition if the parallelism of the input is 1
(check {{inputDs.getParallelism() == 1}})
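
A minimal sketch of the checks suggested in the list above, assuming simplified stand-in classes. {{Expression}}, {{UnresolvedFieldReference}}, {{Asc}} and {{Desc}} here are hypothetical stubs that only mirror the names mentioned above, not the actual Table API classes. {{Sum}}, {{OrderByValidation}}, {{isSupported}}, {{validate}} and {{needsRangePartition}} are names made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for the Table API expression classes; the real
// classes carry more state and behavior than these stubs.
abstract class Expression {}

final class UnresolvedFieldReference extends Expression {
    final String name;
    UnresolvedFieldReference(String name) { this.name = name; }
}

final class Asc extends Expression {
    final Expression child;
    Asc(Expression child) { this.child = child; }
}

final class Desc extends Expression {
    final Expression child;
    Desc(Expression child) { this.child = child; }
}

// Hypothetical aggregation, used only to show an expression that is rejected.
final class Sum extends Expression {
    final Expression child;
    Sum(Expression child) { this.child = child; }
}

final class OrderByValidation {

    // Accepts a plain field reference, or Asc/Desc wrapping a plain field reference.
    static boolean isSupported(Expression e) {
        if (e instanceof UnresolvedFieldReference) {
            return true;
        }
        if (e instanceof Asc) {
            return ((Asc) e).child instanceof UnresolvedFieldReference;
        }
        if (e instanceof Desc) {
            return ((Desc) e).child instanceof UnresolvedFieldReference;
        }
        return false;
    }

    // Rejects the whole orderBy() call if any expression is unsupported.
    static void validate(List<Expression> sortExpressions) {
        for (Expression e : sortExpressions) {
            if (!isSupported(e)) {
                throw new IllegalArgumentException(
                    "orderBy() only supports field references, optionally wrapped in Asc or Desc");
            }
        }
    }

    // Range partitioning is only needed when the input runs with parallelism > 1.
    static boolean needsRangePartition(int inputParallelism) {
        return inputParallelism > 1;
    }
}
```

With these stubs, {{OrderByValidation.validate(Arrays.asList(new Asc(new UnresolvedFieldReference("a"))))}} passes, while a list containing {{new Sum(new UnresolvedFieldReference("a"))}} is rejected with an {{IllegalArgumentException}}.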

I'll be on vacation for about two weeks, so I may not be able to follow up on this until I
am back.

> Add orderBy() to Table API
> --------------------------
>                 Key: FLINK-2946
>                 URL: https://issues.apache.org/jira/browse/FLINK-2946
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table API
>            Reporter: Timo Walther
>            Assignee: Dawid Wysakowicz
> In order to implement a FLINK-2099 prototype that uses the Table API's code generation
> facilities, the Table API needs a sorting feature.
> I would implement it in the next days. Ideas on how to implement such a sorting feature are
> very welcome. Is there a more efficient way than {{.sortPartition(...).setParallelism(1)}}?
> Is it better to sort locally on the nodes first and finally sort on one node afterwards?

