spark-user mailing list archives

From Richard Reitmeyer <Richard_Reitme...@symantec.com.INVALID>
Subject apache-spark Structured Stateful Streaming with window / SPARK-21641
Date Tue, 15 Oct 2019 21:36:06 GMT
What’s the right way to use Structured Streaming with both state and windows?

Looking at slides 26 and 31 of https://www.slideshare.net/databricks/arbitrary-stateful-aggregations-using-structured-streaming-in-apache-spark,
it looks like stateful processing of events for every device, every minute,
should be written as:

events
  .withWatermark("event_time", "2 minutes")
  .groupBy("device_id", window("event_time", "1 minute"))
  .flatMapWithState(…)(…)
  …

But with Apache Spark 2.4.4 this doesn’t work, and https://issues.apache.org/jira/browse/SPARK-21641
appears to be the cause.
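
One thing that might work around this is to fold the window into the grouping key by hand, so that no window(...) column is needed in the grouping. Below is a minimal sketch in plain Scala (no Spark dependency) of just that keying step. Event, WindowKey, and their field names are hypothetical and only for illustration, and whether this actually sidesteps SPARK-21641 is an assumption, not something confirmed here.

```scala
import java.time.Instant

// Hypothetical event type; the field names mirror the columns used in the snippet above.
final case class Event(deviceId: String, eventTime: Instant, value: Double)

object WindowKey {
  // Length of one tumbling window: one minute, in milliseconds.
  val WindowMillis: Long = 60000L

  // Start (epoch milliseconds) of the one-minute tumbling window that contains `t`.
  // floorDiv keeps the bucketing correct for timestamps before the epoch as well.
  def windowStart(t: Instant): Long =
    Math.floorDiv(t.toEpochMilli, WindowMillis) * WindowMillis

  // Composite grouping key: (device, window start).
  def key(e: Event): (String, Long) = (e.deviceId, windowStart(e.eventTime))
}

object Demo extends App {
  // Made-up sample events, for illustration only.
  val events = Seq(
    Event("dev-a", Instant.parse("2019-10-15T21:36:06Z"), 1.0),
    Event("dev-a", Instant.parse("2019-10-15T21:36:59Z"), 2.0),
    Event("dev-a", Instant.parse("2019-10-15T21:37:00Z"), 3.0),
    Event("dev-b", Instant.parse("2019-10-15T21:36:30Z"), 4.0)
  )

  // Group in memory by (device, window); each group is what a per-key
  // state function would receive for one device and one minute.
  val grouped: Map[(String, Long), Seq[Event]] = events.groupBy(WindowKey.key)

  grouped.toSeq.sortBy(_._1).foreach { case ((device, start), es) =>
    println(s"$device ${Instant.ofEpochMilli(start)} count=${es.size}")
  }
}
```

In Spark, a function like WindowKey.key could presumably be passed to groupByKey ahead of flatMapGroupsWithState, with the watermark still set on event_time, though that is untested here.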

What’s the recommended way to handle this?

