spark-user mailing list archives

From Arun Mahadevan <ar...@apache.org>
Subject Re: can we use mapGroupsWithState in raw sql?
Date Wed, 18 Apr 2018 16:36:54 GMT
Can't the “max” function be used here? Something like:

stream.groupBy($"id").max("amount").writeStream.outputMode("complete"/"update")….


Unless the “stream” is already a grouped stream, in which case the above would not work
since the support for multiple aggregate operations is not there yet.
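
A minimal sketch of the suggestion above, assuming the schema from the table in the question and a hypothetical JSON file source at /tmp/events (neither the source nor the object name comes from this thread). It also shows one way to keep the whole row in raw SQL: taking max over a struct whose first field is amount, which I believe compares fields left to right.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{IntegerType, StructType, TimestampType}

object MaxPerGroupSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("max-per-group-sketch")
      .master("local[2]")
      .getOrCreate()
    import spark.implicits._

    // Assumed schema, mirroring the id / amount / my_timestamp table in the question.
    val schema = new StructType()
      .add("id", IntegerType)
      .add("amount", IntegerType)
      .add("my_timestamp", TimestampType)

    // Hypothetical streaming source: JSON files dropped into /tmp/events.
    val stream = spark.readStream.schema(schema).json("/tmp/events")

    // The suggestion as written: maximum amount per id (does not carry my_timestamp).
    // Defined only for comparison; this query is not started below.
    val maxAmount = stream.groupBy($"id").max("amount")

    // Raw SQL variant that keeps the whole row: max over a struct orders by
    // amount first, then my_timestamp, so the row with the largest amount wins.
    stream.createOrReplaceTempView("events")
    val maxRow = spark.sql(
      """SELECT id, m.amount, m.my_timestamp
        |FROM (SELECT id, max(struct(amount, my_timestamp)) AS m
        |      FROM events
        |      GROUP BY id) t""".stripMargin)

    // Streaming aggregations need "update" or "complete" output mode.
    val query = maxRow.writeStream
      .outputMode("update")
      .format("console")
      .start()

    query.awaitTermination()
  }
}
```

This is still a single aggregation per query, so it stays within the limitation mentioned above.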

Thanks,
Arun

From:  kant kodali <kanth909@gmail.com>
Date:  Tuesday, April 17, 2018 at 11:41 AM
To:  Tathagata Das <tathagata.das1565@gmail.com>
Cc:  "user @spark" <user@spark.apache.org>
Subject:  Re: can we use mapGroupsWithState in raw sql?

Hi TD, 

Thanks for that. The only reason I ask is I don't see any alternative solution to solve the
problem below using raw sql.


How can I select the max row for every group in Spark Structured Streaming 2.3.0 without using
ORDER BY (since it requires complete mode) or mapGroupsWithState?

Input:
id | amount     | my_timestamp
-------------------------------------------
1  |      5     |  2018-04-01T01:00:00.000Z
1  |     10     |  2018-04-01T01:10:00.000Z
2  |     20     |  2018-04-01T01:20:00.000Z
2  |     30     |  2018-04-01T01:25:00.000Z
2  |     40     |  2018-04-01T01:30:00.000Z
Expected Output:
id | amount     | my_timestamp
-------------------------------------------
1  |     10     |  2018-04-01T01:10:00.000Z
2  |     40     |  2018-04-01T01:30:00.000Z
I am looking for a streaming solution using either raw SQL, like sparkSession.sql("sql query"), or
something similar to raw SQL, but not something like mapGroupsWithState.


On Mon, Apr 16, 2018 at 8:32 PM, Tathagata Das <tathagata.das1565@gmail.com> wrote:
Unfortunately no. Honestly, it does not make sense, because type-aware operations like map, mapGroups,
etc. require you to provide an actual JVM function. That does not fit in with the SQL language
structure.
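
To illustrate the point, here is a rough sketch of the same max-per-id problem written with mapGroupsWithState. The Event case class and the keepMax and maxPerId names are hypothetical, chosen for this example; what matters is that the state-update logic is an ordinary Scala function, which a SQL string has no way to express.

```scala
import java.sql.Timestamp
import org.apache.spark.sql.Dataset
import org.apache.spark.sql.streaming.{GroupState, GroupStateTimeout}

// Hypothetical row type matching the table in the question.
case class Event(id: Int, amount: Int, my_timestamp: Timestamp)

object MaxWithStateSketch {
  // Plain JVM function: compare the new events for one id against the
  // previously stored maximum, store the winner, and emit it.
  def keepMax(id: Int, events: Iterator[Event], state: GroupState[Event]): Event = {
    val candidates = events ++ state.getOption.iterator
    val best = candidates.maxBy(_.amount)
    state.update(best)
    best
  }

  // Assumes `events` is a streaming Dataset[Event]; the resulting query
  // has to be written with outputMode("update").
  def maxPerId(events: Dataset[Event]): Dataset[Event] = {
    import events.sparkSession.implicits._
    events
      .groupByKey(_.id)
      .mapGroupsWithState[Event, Event](GroupStateTimeout.NoTimeout)(keepMax _)
  }
}
```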

On Mon, Apr 16, 2018 at 7:34 PM, kant kodali <kanth909@gmail.com> wrote:
Hi All, 

Can we use mapGroupsWithState in raw SQL? Or is it on the roadmap?

Thanks!
