flink-issues mailing list archives

From "Todd Lisonbee (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-3613) Add standard deviation, mean, variance to list of Aggregations
Date Fri, 18 Mar 2016 14:25:33 GMT

    [ https://issues.apache.org/jira/browse/FLINK-3613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201538#comment-15201538 ]

Todd Lisonbee commented on FLINK-3613:
--------------------------------------

Sure, I'll create a design for this.  Thanks.

> Add standard deviation, mean, variance to list of Aggregations
> --------------------------------------------------------------
>
>                 Key: FLINK-3613
>                 URL: https://issues.apache.org/jira/browse/FLINK-3613
>             Project: Flink
>          Issue Type: Improvement
>            Reporter: Todd Lisonbee
>            Priority: Minor
>
> Implement standard deviation, mean, and variance for org.apache.flink.api.java.aggregation.Aggregations.
> Ideally, the implementation should be single-pass and numerically stable.
> References:
> "Scalable and Numerically Stable Descriptive Statistics in SystemML", Tian et al.,
> International Conference on Data Engineering 2012
> http://dl.acm.org/citation.cfm?id=2310392
> "The Kahan summation algorithm (also known as compensated summation) reduces the numerical
> errors that occur when adding a sequence of finite precision floating point numbers. Numerical
> errors arise due to truncation and rounding. These errors can lead to numerical instability
> when calculating variance."
> https://en.wikipedia.org/wiki/Kahan_summation_algorithm
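
For illustration only, here is a minimal Java sketch of what "single pass and numerically stable" could look like. It is not the design proposed in this issue and not part of Flink's Aggregations API. The class name RunningStats and all of its methods are hypothetical. The sketch assumes Welford's online update for mean and variance, a pairwise merge (in the style of Chan et al.) for combining partial results from parallel partitions, and a separate Kahan (compensated) summation helper.

```java
// Hypothetical helper for illustration; not part of Flink's Aggregations API.
final class RunningStats {
    private long count;   // number of values seen so far
    private double mean;  // running mean
    private double m2;    // running sum of squared deviations from the mean

    /** Welford's update: folds one value into the running mean and M2. */
    void add(double x) {
        count++;
        double delta = x - mean;
        mean += delta / count;
        m2 += delta * (x - mean); // uses the already-updated mean
    }

    /** Merges another partial result into this one (pairwise update). */
    void merge(RunningStats other) {
        if (other.count == 0) {
            return;
        }
        if (count == 0) {
            count = other.count;
            mean = other.mean;
            m2 = other.m2;
            return;
        }
        long total = count + other.count;
        double delta = other.mean - mean;
        mean += delta * other.count / total;
        m2 += other.m2 + delta * delta * ((double) count * other.count / total);
        count = total;
    }

    long count() {
        return count;
    }

    double mean() {
        return count == 0 ? Double.NaN : mean;
    }

    /** Population variance (divides by n). */
    double variance() {
        return count == 0 ? Double.NaN : m2 / count;
    }

    /** Sample variance (divides by n - 1). */
    double sampleVariance() {
        return count < 2 ? Double.NaN : m2 / (count - 1);
    }

    /** Population standard deviation. */
    double stdDev() {
        return Math.sqrt(variance());
    }

    /** Kahan (compensated) summation of an array of doubles. */
    static double kahanSum(double[] values) {
        double sum = 0.0;
        double compensation = 0.0; // low-order bits lost in earlier additions
        for (double value : values) {
            double y = value - compensation;
            double t = sum + y;
            compensation = (t - sum) - y;
            sum = t;
        }
        return sum;
    }
}
```

As a usage example, feeding the values 2, 4, 4, 4, 5, 5, 7, 9 through add() gives a mean of 5 and a population variance of 4, up to floating-point rounding. Splitting the same values across two RunningStats instances and calling merge() should give the same result, which is the property a combinable aggregation would need.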



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
