spark-user mailing list archives

From Javier Rey <jre...@gmail.com>
Subject Sum array values by row in new column
Date Mon, 15 Aug 2016 17:02:27 GMT
Hi everyone,

I have a DataFrame with a single column, and that column holds an array of numbers.
How can I sum each array row by row to obtain a new column with the sum, in PySpark?

Example:

+------------+
|     numbers|
+------------+
|[10, 20, 30]|
|[40, 50, 60]|
|[70, 80, 90]|
+------------+

The idea is to obtain the same df with a new column of totals:

+------------+-----+
|     numbers|     |
+------------+-----+
|[10, 20, 30]|   60|
|[40, 50, 60]|  150|
|[70, 80, 90]|  240|
+------------+-----+

Regards!

Samir
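
One possible way to get the totals column described above is a Python UDF that sums each row's array. This is only a sketch under assumptions not stated in the message: the arrays hold integers (hence `LongType`), the new column is called `total`, and the helper names `array_total` and `with_totals` are hypothetical.

```python
def array_total(values):
    """Sum the numbers in one row's array; a null array stays null."""
    if values is None:
        return None
    return sum(values)


def with_totals(df, array_col="numbers", total_col="total"):
    """Return df with an extra column holding the per-row array sum.

    Assumes the array elements are integers; for floating-point
    arrays, DoubleType would be the return type to use instead.
    """
    # Imported here so array_total stays usable without Spark installed.
    from pyspark.sql.functions import udf
    from pyspark.sql.types import LongType

    total_udf = udf(array_total, LongType())
    return df.withColumn(total_col, total_udf(df[array_col]))


# Usage, assuming an active SparkSession named `spark`:
# df = spark.createDataFrame(
#     [([10, 20, 30],), ([40, 50, 60],), ([70, 80, 90],)], ["numbers"]
# )
# with_totals(df).show()
```

A Python UDF moves every row through the Python worker, so it can be slow on large data. On Spark 2.4 and later, the built-in SQL higher-order function `aggregate` may be a faster alternative, for example `expr("aggregate(numbers, 0L, (acc, x) -> acc + x)")` with `expr` imported from `pyspark.sql.functions`.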
