spark-user mailing list archives

From Sonu Jyotshna
Subject Multiple column aggregations
Date Sat, 09 Feb 2019 04:46:49 GMT

I have a requirement where I need to group by more than one column, but
not in the same aggregation. My dataset has a structure containing an
accountid, some other columns, and an orderid. I need to compute a few
scenarios. One is accounts that have multiple orders; grouping by accountid
and aggregating works for that. Another is order ids that are associated
with multiple accounts; grouping by orderid would probably work there. For
better performance at the dataset level, can both be done in a single step?
Or is there a better approach I can follow? Any help would be appreciated.
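
To pin down the two results being asked for, here is a minimal sketch in plain Python (not Spark). The sample rows, the function name `multi_links`, and the variable names are all hypothetical; only the column names accountid and orderid come from the description above. It builds both groupings in one pass over the rows, which is the behaviour the question hopes to get at the dataset level.

```python
from collections import defaultdict

# Hypothetical sample rows: (accountid, orderid). Other columns are omitted.
rows = [
    ("A1", "O1"),
    ("A1", "O2"),
    ("A2", "O2"),
    ("A3", "O3"),
]


def multi_links(rows):
    """Return (accounts with >1 distinct order, orders with >1 distinct account)."""
    orders_by_account = defaultdict(set)
    accounts_by_order = defaultdict(set)

    # Single pass: every row updates both groupings.
    for accountid, orderid in rows:
        orders_by_account[accountid].add(orderid)
        accounts_by_order[orderid].add(accountid)

    # Keep only the keys linked to more than one distinct value.
    multi_order_accounts = {
        a: sorted(o) for a, o in orders_by_account.items() if len(o) > 1
    }
    multi_account_orders = {
        o: sorted(a) for o, a in accounts_by_order.items() if len(a) > 1
    }
    return multi_order_accounts, multi_account_orders


accounts, orders = multi_links(rows)
print(accounts)  # {'A1': ['O1', 'O2']}
print(orders)    # {'O2': ['A1', 'A2']}
```

In Spark terms these correspond to one group-by on accountid and one group-by on orderid; whether the two can be combined into a single step is the open question here.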

