spark-user mailing list archives

From Somasundaram Sekar
Subject DataFrame multiple agg on the same column
Date Sat, 07 Oct 2017 17:12:17 GMT

I have a GroupedData object on which I aggregate a few columns. Since I
pass the aggregations to agg as a map, I cannot perform multiple aggregations
on the same column. Say I want to have both the max and the min of amount.

So the line of code below returns only one aggregate per column:

grouped_txn.agg({'*' : 'count', 'amount' : 'sum', 'amount' : 'max',
'created_time' : 'min', 'created_time' : 'max'})
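
The loss of aggregates can be reproduced without Spark. The following is a
minimal sketch in plain Python, reusing the map from the call above; the
variable name spec is only illustrative. It shows that a dict literal keeps
just the last value given for a repeated key, so the 'sum' and 'min' entries
are already gone before agg is called.

```python
# Plain Python, no Spark required: the same map that is passed to agg above.
# A dict literal silently keeps only the last value for a repeated key.
spec = {'*': 'count', 'amount': 'sum', 'amount': 'max',
        'created_time': 'min', 'created_time': 'max'}

print(spec)
# {'*': 'count', 'amount': 'max', 'created_time': 'max'}

# Only three of the five requested aggregates survive.
print(len(spec))
# 3
```
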

What are the possible alternatives? I could define a new column that is just
a copy of the original and aggregate on that, but that looks ugly.

Somasundaram S
