flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-1293) Add support for out-of-place aggregations
Date Mon, 01 Dec 2014 14:56:12 GMT

    [ https://issues.apache.org/jira/browse/FLINK-1293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14229874#comment-14229874 ]

ASF GitHub Bot commented on FLINK-1293:
---------------------------------------

Github user rmetzger commented on the pull request:

    https://github.com/apache/incubator-flink/pull/243#issuecomment-65075505
  
    Thank you for reviewing the pull request!
    I suggest merging the change first and then improving it by changing the internal types.
    
    Regarding the second point (POJO support): I once started working on that. I think the best approach would be to add a method to the serializers that allows access to fields.
    That way, we can reuse the existing code for analyzing (nested) POJO fields. It would also generalize over Tuples and POJOs and avoid special-casing either of them (implementing field access for Tuples in the Tuple serializers is trivial).
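
A minimal sketch of the field-access idea described above, under stated assumptions: `FieldAccess`, `Pair`, `PairFieldAccess`, `PojoFieldAccess`, `WordCount` and `FieldAccessSketch` are hypothetical names invented for illustration and are not part of Flink's serializer API. The sketch only covers flat (non-nested) POJO fields and uses plain reflection, which is not necessarily how the serializers would do it.

```java
import java.lang.reflect.Field;
import java.util.Arrays;
import java.util.List;

// Hypothetical interface for illustration; not part of Flink's serializer API.
interface FieldAccess<T> {
    /** Returns the value of the field at the given logical position. */
    Object getField(T record, int position);
}

// Stand-in for a two-field tuple; not Flink's Tuple2.
final class Pair {
    final Object f0;
    final Object f1;

    Pair(Object f0, Object f1) {
        this.f0 = f0;
        this.f1 = f1;
    }
}

// Field access for the tuple-like type is a plain positional lookup.
final class PairFieldAccess implements FieldAccess<Pair> {
    @Override
    public Object getField(Pair record, int position) {
        switch (position) {
            case 0: return record.f0;
            case 1: return record.f1;
            default: throw new IndexOutOfBoundsException("position " + position);
        }
    }
}

// Field access for a POJO resolves the named fields once via reflection.
final class PojoFieldAccess<T> implements FieldAccess<T> {
    private final Field[] fields;

    PojoFieldAccess(Class<T> type, String... fieldNames) {
        fields = new Field[fieldNames.length];
        for (int i = 0; i < fieldNames.length; i++) {
            try {
                fields[i] = type.getDeclaredField(fieldNames[i]);
                fields[i].setAccessible(true);
            } catch (NoSuchFieldException e) {
                throw new IllegalArgumentException("unknown field " + fieldNames[i], e);
            }
        }
    }

    @Override
    public Object getField(T record, int position) {
        try {
            return fields[position].get(record);
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }
}

// Example POJO used only in this sketch.
final class WordCount {
    String word;
    int count;

    WordCount(String word, int count) {
        this.word = word;
        this.count = count;
    }
}

final class FieldAccessSketch {
    // One aggregation routine serves both tuples and POJOs.
    static <T> long sum(Iterable<T> records, FieldAccess<T> access, int position) {
        long total = 0;
        for (T record : records) {
            total += ((Number) access.getField(record, position)).longValue();
        }
        return total;
    }

    public static void main(String[] args) {
        List<Pair> pairs = Arrays.asList(new Pair("a", 1), new Pair("b", 2));
        List<WordCount> pojos = Arrays.asList(new WordCount("a", 1), new WordCount("b", 2));
        System.out.println(sum(pairs, new PairFieldAccess(), 1));
        System.out.println(sum(pojos, new PojoFieldAccess<WordCount>(WordCount.class, "word", "count"), 1));
    }
}
```

Because `sum` only depends on the accessor, the same aggregation code runs over both record kinds without a Tuple-specific branch.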
    
    Once serialization of POJOs has become more efficient, we can throw away the Tuple-specific code and handle Tuples like POJOs.


> Add support for out-of-place aggregations
> -----------------------------------------
>
>                 Key: FLINK-1293
>                 URL: https://issues.apache.org/jira/browse/FLINK-1293
>             Project: Flink
>          Issue Type: Improvement
>          Components: Java API, Scala API
>    Affects Versions: 0.7.0-incubating
>            Reporter: Viktor Rosenfeld
>            Assignee: Viktor Rosenfeld
>            Priority: Minor
>
> Currently, the output of an aggregation is of the same type as the input. This restriction has two major drawbacks:
> 1. Every tuple field can only be used in one aggregation because the aggregation's result is stored in the field.
> 2. Aggregations having a return type that is different from the input type, e.g., count or average, cannot be implemented.
> It would be nice to have the aggregation return any kind of tuple as a result, so the restrictions above no longer apply.
> See also:
> - https://github.com/stratosphere/stratosphere/wiki/Design-of-Aggregate-Operator
> - http://apache-flink-incubator-mailing-list-archive.1008284.n3.nabble.com/Hi-Aggregation-support-td2311.html
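
The two drawbacks in the issue description can be illustrated with a small sketch of an out-of-place aggregation. It is plain Java, not the Flink API and not the code from the pull request: `AggregateResult` and `OutOfPlaceAggregation` are hypothetical names, and rows are modeled as `int[]` for brevity. The same input field feeds several aggregations, and the result type (including a `long` count and a `double` average) differs from the input type.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical result type; its shape is independent of the input rows.
final class AggregateResult {
    final long count;
    final int min;
    final int max;
    final double average;

    AggregateResult(long count, int min, int max, double average) {
        this.count = count;
        this.min = min;
        this.max = max;
        this.average = average;
    }
}

final class OutOfPlaceAggregation {
    // Computes several aggregates over one field without writing
    // any result back into the input rows.
    static AggregateResult aggregate(List<int[]> rows, int field) {
        if (rows.isEmpty()) {
            throw new IllegalArgumentException("no rows to aggregate");
        }
        long count = 0;
        long sum = 0;
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int[] row : rows) {
            int value = row[field];
            count++;
            sum += value;
            min = Math.min(min, value);
            max = Math.max(max, value);
        }
        return new AggregateResult(count, min, max, (double) sum / count);
    }

    public static void main(String[] args) {
        List<int[]> rows = Arrays.asList(
                new int[] {1, 10}, new int[] {2, 20}, new int[] {3, 60});
        AggregateResult r = aggregate(rows, 1);
        System.out.println(r.count + " " + r.min + " " + r.max + " " + r.average);
    }
}
```

An in-place aggregation could store only one of these values in field 1 and could not change the field's type to hold the average.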



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
