spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Arun Mahadevan <ar...@apache.org>
Subject Re: Plans for Session Windows?
Date Thu, 09 Aug 2018 20:29:03 GMT
Not sure if it changed recently, but the beam spark runner still uses the
DStreams API. It need not necessarily use the higher level apis (like
mapGroupsWithState or equivalent) provided by spark, it could as well be
based on groupByKey or reduce primitives and the window functions provided
by beam.

On 9 August 2018 at 12:23, Mike Sukmanowsky <mike.sukmanowsky@gmail.com>
wrote:

> Thanks Arun. It's curious as Apache Beam says it fully supports
> <https://beam.apache.org/documentation/runners/capability-matrix/>
> session windows running on Spark but I'd imagine that, under the hood, it's
> leveraging mapGroupsWithState.
>
> Are there any plans to support the mapGroupsWithState API in PySpark?
>
> On Thu, 9 Aug 2018 at 13:13, Arun Mahadevan <arunm@apache.org> wrote:
>
>> There is no stock API to do this directly, but it can be implemented on
>> top of mapGroupWithState like here -
>>
>>
>> https://github.com/apache/spark/blob/master/examples/
>> src/main/scala/org/apache/spark/examples/sql/streaming/
>> StructuredSessionization.scala
>>
>>
>>
>> It would be worth bundling this into a builtin api, but AFAIK there are
>> no plans yet.
>>
>>
>>
>> Thanks,
>>
>> Arun
>>
>>
>> On 9 August 2018 at 08:02, Mike Sukmanowsky <mike.sukmanowsky@gmail.com>
>> wrote:
>>
>>> Hi all,
>>>
>>> Just wondering if Spark has any plans to support session-based windows
>>> in structured streaming as documented by Apache Beam here
>>> <https://beam.apache.org/documentation/programming-guide/#provided-windowing-functions>
or
>>> perhaps better in this blog post
>>> <https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-102>?
>>>
>>> [image: Figure-16---Session-Merging-a526d52186fcc33f3b5b5c59b176ac4e.jpg]
>>>
>>> --
>>> Mike Sukmanowsky
>>> Aspiring Digital Carpenter
>>>
>>> *e*: mike.sukmanowsky@gmail.com
>>>
>>> LinkedIn <http://www.linkedin.com/profile/view?id=10897143> | github
>>> <https://github.com/msukmanowsky>
>>>
>>>
>>
>
> --
> Mike Sukmanowsky
> Aspiring Digital Carpenter
>
> *e*: mike.sukmanowsky@gmail.com
>
> LinkedIn <http://www.linkedin.com/profile/view?id=10897143> | github
> <https://github.com/msukmanowsky>
>
>

Mime
View raw message