spark-dev mailing list archives

From Jungtaek Lim <kabhwan.opensou...@gmail.com>
Subject Re: What's the root cause of not supporting multiple aggregations in structured streaming?
Date Fri, 04 Sep 2020 06:06:50 GMT
Unfortunately, I don't see enough active committers working on Structured
Streaming; I don't expect major features or improvements to land in
this situation.

Technically I can review and merge PRs for major improvements in SS, but
that depends on how big a change the proposal makes. If the proposal brings
a conceptual change, review by a single committer still wouldn't be enough.

So it's not because we think the feature is worthless. (That might be only
me, though.) I read it as there simply not being much investment in SS.
There's also a known workaround for multiple aggregations (I've documented
it in the SS guide doc, in the "Limitation of global watermark" section),
though I totally agree the workaround is bad.

On Tue, Sep 1, 2020 at 12:28 AM Etienne Chauchot <echauchot@apache.org>
wrote:

> Hi all,
>
> I'm also very interested in this feature, but the PR has been open since
> January 2019 and has not been updated. It raised a design discussion around
> watermarks, and a design doc was written (
> https://docs.google.com/document/d/1IAH9UQJPUiUCLd7H6dazRK2k1szDX38SnM6GVNZYvUo/edit#heading=h.npkueh4bbkz1).
> We also commented on this design, but the subject still seems to be
> stale.
>
> Is there any interest in the community in delivering this feature, or is it
> considered worthless? If the latter, can you explain why?
>
> Best
>
> Etienne
> On 22/05/2019 03:38, 张万新 wrote:
>
> Thanks, I'll check it out.
>
> On Tue, May 21, 2019 at 01:31, Arun Mahadevan <arunm@apache.org> wrote:
>
>> Here's the proposal for supporting it in "append" mode -
>> https://github.com/apache/spark/pull/23576. You could see if it
>> addresses your requirement and post your feedback in the PR.
>> For "update" mode it's going to be much harder to support this without
>> first adding support for "retractions"; otherwise we would end up with
>> wrong results.
>>
>> - Arun
>>
>>
>> On Mon, 20 May 2019 at 01:34, Gabor Somogyi <gabor.g.somogyi@gmail.com>
>> wrote:
>>
>>> There is a PR for this, but it is not yet merged.
>>>
>>> On Mon, May 20, 2019 at 10:13 AM 张万新 <kevinzwx1992@gmail.com> wrote:
>>>
>>>> Hi there,
>>>>
>>>> I'd like to know the root reason why multiple aggregations on a
>>>> streaming DataFrame are not allowed, since it's a very useful feature and
>>>> Flink has supported it for a long time.
>>>>
>>>> Thanks.
>>>>
>>>
