spark-issues mailing list archives

From "Apache Spark (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-23911) High-order function: aggregate(array<T>, initialState S, inputFunction<S, T, S>, outputFunction<S, R>) → R
Date Wed, 08 Aug 2018 08:22:00 GMT

    [ https://issues.apache.org/jira/browse/SPARK-23911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16572871#comment-16572871 ]

Apache Spark commented on SPARK-23911:
--------------------------------------

User 'ueshin' has created a pull request for this issue:
https://github.com/apache/spark/pull/22035

> High-order function: aggregate(array<T>, initialState S, inputFunction<S, T, S>, outputFunction<S, R>) → R
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-23911
>                 URL: https://issues.apache.org/jira/browse/SPARK-23911
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 2.3.0
>            Reporter: Xiao Li
>            Assignee: Takuya Ueshin
>            Priority: Major
>             Fix For: 2.4.0
>
>
> Ref: https://prestodb.io/docs/current/functions/array.html
> Returns a single value reduced from the array. inputFunction is invoked once for each
> element of the array, in order. It takes the current state (initially initialState) and
> the element, and returns the new state. After the last element, outputFunction is invoked
> to turn the final state into the result value. It may be the identity function (i -> i).
> {noformat}
> SELECT aggregate(ARRAY [], 0, (s, x) -> s + x, s -> s); -- 0
> SELECT aggregate(ARRAY [5, 20, 50], 0, (s, x) -> s + x, s -> s); -- 75
> SELECT aggregate(ARRAY [5, 20, NULL, 50], 0, (s, x) -> s + x, s -> s); -- NULL
> SELECT aggregate(ARRAY [5, 20, NULL, 50], 0, (s, x) -> s + COALESCE(x, 0), s -> s); -- 75
> SELECT aggregate(ARRAY [5, 20, NULL, 50], 0, (s, x) -> IF(x IS NULL, s, s + x), s -> s); -- 75
> SELECT aggregate(ARRAY [2147483647, 1], CAST (0 AS BIGINT), (s, x) -> s + x, s -> s); -- 2147483648
> SELECT aggregate(ARRAY [5, 6, 10, 20], -- calculates arithmetic average: 10.25
>               CAST(ROW(0.0, 0) AS ROW(sum DOUBLE, count INTEGER)),
>               (s, x) -> CAST(ROW(x + s.sum, s.count + 1) AS ROW(sum DOUBLE, count INTEGER)),
>               s -> IF(s.count = 0, NULL, s.sum / s.count));
> {noformat}
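
The fold semantics described in the issue can be modeled outside SQL. The following is a minimal Python sketch, not the Spark or Presto implementation: the names `aggregate` and `null_safe_add` are hypothetical helpers, `None` stands in for SQL NULL, and a tuple stands in for the ROW(sum, count) state. It reproduces several of the quoted examples.

```python
from functools import reduce


def aggregate(array, initial_state, input_function, output_function=lambda s: s):
    """Fold `array` left to right, then map the final state to the result.

    Hypothetical model of aggregate(array, initialState, inputFunction,
    outputFunction); the identity function is assumed as the default
    outputFunction.
    """
    final_state = reduce(input_function, array, initial_state)
    return output_function(final_state)


def null_safe_add(s, x):
    # Mimic SQL arithmetic: adding NULL (None here) yields NULL.
    return None if s is None or x is None else s + x


# Empty array: the initial state is returned unchanged.
empty_sum = aggregate([], 0, null_safe_add)

# Plain sum of the elements.
plain_sum = aggregate([5, 20, 50], 0, null_safe_add)

# A NULL element poisons the running state.
null_sum = aggregate([5, 20, None, 50], 0, null_safe_add)

# Skipping NULL elements, as in the IF(x IS NULL, s, s + x) example.
skip_null_sum = aggregate(
    [5, 20, None, 50], 0, lambda s, x: s if x is None else s + x
)

# Arithmetic average with a (sum, count) state and a finishing function.
average = aggregate(
    [5, 6, 10, 20],
    (0.0, 0),
    lambda s, x: (s[0] + x, s[1] + 1),
    lambda s: None if s[1] == 0 else s[0] / s[1],
)

print(empty_sum, plain_sum, null_sum, skip_null_sum, average)
```

The sketch does not model SQL integer widths, so the BIGINT overflow example has no direct counterpart here.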



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

