flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-1444) Add data properties for data sources
Date Fri, 13 Feb 2015 08:20:12 GMT

    [ https://issues.apache.org/jira/browse/FLINK-1444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319731#comment-14319731 ]

ASF GitHub Bot commented on FLINK-1444:
---------------------------------------

Github user rmetzger commented on the pull request:

    https://github.com/apache/flink/pull/379#issuecomment-74219481
  
    +1 to merge


> Add data properties for data sources
> ------------------------------------
>
>                 Key: FLINK-1444
>                 URL: https://issues.apache.org/jira/browse/FLINK-1444
>             Project: Flink
>          Issue Type: New Feature
>          Components: Java API, JobManager, Optimizer
>    Affects Versions: 0.9
>            Reporter: Fabian Hueske
>            Assignee: Fabian Hueske
>            Priority: Minor
>
> This issue proposes to add support for attaching data properties to data sources.
> These data properties are defined with respect to input splits.
> Possible properties are:
> - partitioning across splits: all elements with the same key (combination) are
>   contained in one split
> - sorting / grouping within splits: elements are sorted or grouped on certain keys
>   within a split
> - key uniqueness: a certain key (combination) is unique across all elements of the
>   data source. This property is not defined with respect to input splits.
> The optimizer can leverage this information to generate more efficient execution plans.
> The InputFormat will be responsible for generating input splits such that the promised
> data properties actually hold. Otherwise, the program will produce invalid results.
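
The three properties described in the issue can be illustrated with a minimal sketch. This is not Flink code and does not use any Flink API: it assumes, purely for illustration, that a data source is a list of splits, that each split is a list of records, and that a hypothetical `key` function extracts the key from a record. All function names are made up for this example.

```python
from itertools import groupby


def is_partitioned(splits, key):
    """Partitioning across splits: each key value occurs in at most one split."""
    owner = {}  # key value -> index of the first split in which it was seen
    for idx, split in enumerate(splits):
        for record in split:
            if owner.setdefault(key(record), idx) != idx:
                return False
    return True


def is_sorted_within_splits(splits, key):
    """Sorting within splits: keys are non-decreasing inside every split."""
    return all(
        key(a) <= key(b)
        for split in splits
        for a, b in zip(split, split[1:])
    )


def is_grouped_within_splits(splits, key):
    """Grouping within splits: equal keys are adjacent inside every split."""
    for split in splits:
        seen = set()
        for k, _ in groupby(split, key=key):
            if k in seen:  # the same key started a second, separate run
                return False
            seen.add(k)
    return True


def is_key_unique(splits, key):
    """Key uniqueness: no key occurs twice anywhere in the data source."""
    seen = set()
    for split in splits:
        for record in split:
            k = key(record)
            if k in seen:
                return False
            seen.add(k)
    return True


# Hypothetical example data: records are (key, value) pairs.
splits = [[("a", 1), ("a", 2), ("b", 3)], [("c", 4), ("d", 5)]]
first_field = lambda record: record[0]
```

With this example data, the splits are partitioned, sorted, and grouped on the first field, but the key is not unique because `"a"` occurs twice. An input format that promised one of these properties without its splits satisfying the corresponding check would be the case the issue warns about, where the program produces invalid results.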



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
