flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-1444) Add data properties for data sources
Date Fri, 13 Feb 2015 08:20:12 GMT

    [ https://issues.apache.org/jira/browse/FLINK-1444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319731#comment-14319731 ]

ASF GitHub Bot commented on FLINK-1444:

Github user rmetzger commented on the pull request:

    +1 to merge

> Add data properties for data sources
> ------------------------------------
>                 Key: FLINK-1444
>                 URL: https://issues.apache.org/jira/browse/FLINK-1444
>             Project: Flink
>          Issue Type: New Feature
>          Components: Java API, JobManager, Optimizer
>    Affects Versions: 0.9
>            Reporter: Fabian Hueske
>            Assignee: Fabian Hueske
>            Priority: Minor
> This issue proposes adding support for attaching data properties to data sources. These data properties are defined with respect to input splits.
> Possible properties are:
> - partitioning across splits: all elements with the same key (combination) are contained in one split
> - sorting / grouping within splits: elements are sorted or grouped on certain keys within a split
> - key uniqueness: a certain key (combination) is unique across all elements of the data source. This property is not defined with respect to input splits.
> The optimizer can leverage this information to generate more efficient execution plans.
> The InputFormat is responsible for generating input splits such that the promised data properties actually hold. Otherwise, the program will produce invalid results.
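
To make the three properties concrete, here is a minimal sketch that checks them on in-memory splits. It is only an illustration and not part of Flink or of the pull request: the class SplitPropertyCheck, its method names, and the modelling of a split as a list of integer keys are all assumptions made for this example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper (not a Flink class): each split is modelled as a list of integer keys.
final class SplitPropertyCheck {

    /** Partitioning across splits: every key value occurs in at most one split. */
    static boolean isPartitionedAcrossSplits(List<List<Integer>> splits) {
        Map<Integer, Integer> keyToSplit = new HashMap<>();
        for (int i = 0; i < splits.size(); i++) {
            for (Integer key : splits.get(i)) {
                Integer previous = keyToSplit.putIfAbsent(key, i);
                if (previous != null && previous != i) {
                    return false; // same key seen in a different split
                }
            }
        }
        return true;
    }

    /** Sorting within splits: keys are in non-decreasing order inside each split. */
    static boolean isSortedWithinSplits(List<List<Integer>> splits) {
        for (List<Integer> split : splits) {
            for (int i = 1; i < split.size(); i++) {
                if (split.get(i - 1) > split.get(i)) {
                    return false;
                }
            }
        }
        return true;
    }

    /** Grouping within splits: equal keys are adjacent inside each split. */
    static boolean isGroupedWithinSplits(List<List<Integer>> splits) {
        for (List<Integer> split : splits) {
            Set<Integer> finishedGroups = new HashSet<>();
            Integer current = null;
            for (Integer key : split) {
                if (!key.equals(current)) {
                    if (current != null) {
                        finishedGroups.add(current);
                    }
                    if (finishedGroups.contains(key)) {
                        return false; // a group that already ended starts again
                    }
                    current = key;
                }
            }
        }
        return true;
    }

    /** Key uniqueness: no key occurs twice in the whole source, regardless of splits. */
    static boolean hasUniqueKeys(List<List<Integer>> splits) {
        Set<Integer> seen = new HashSet<>();
        for (List<Integer> split : splits) {
            for (Integer key : split) {
                if (!seen.add(key)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<List<Integer>> splits = Arrays.asList(
                Arrays.asList(1, 1, 3),
                Arrays.asList(2, 4, 4));
        System.out.println("partitioned: " + isPartitionedAcrossSplits(splits));
        System.out.println("sorted:      " + isSortedWithinSplits(splits));
        System.out.println("grouped:     " + isGroupedWithinSplits(splits));
        System.out.println("unique keys: " + hasUniqueKeys(splits));
    }
}
```

A check like this could be useful in a test for a custom InputFormat, since a source that declares a property its splits do not satisfy would lead to invalid results, as the issue description notes.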

This message was sent by Atlassian JIRA
