spark-dev mailing list archives

From Mark Hamstra <m...@clearstorydata.com>
Subject Re: time for Apache Spark 3.0?
Date Thu, 06 Sep 2018 18:07:38 GMT
Yes, that is why we have these annotations in the code and the
corresponding labels appearing in the API documentation:
https://github.com/apache/spark/blob/master/common/tags/src/main/java/org/apache/spark/annotation/InterfaceStability.java

As long as an API method is properly annotated, we can change or even
eliminate it before the next major release. And frankly, we shouldn't be
contemplating bringing in the DS v2 API (and, I'd argue, *any* new API)
without such an annotation. If we create a new API in a fully frozen state,
there is too much risk of not getting everything right before we see how it
fares under wider use, and too much cost in maintaining something we come
to regret until the next major release.
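
To illustrate the kind of labeling meant here, the following is only a
minimal sketch, not Spark's actual code. It defines local stand-ins for the
Stable and Evolving markers rather than importing the real
InterfaceStability class, and the interface names (HypotheticalV2Reader,
HypotheticalStableApi) and the stabilityOf helper are made up for the
example. Runtime retention is also an assumption, chosen so that the demo
can read the label back by reflection.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

public class StabilityDemo {
    // Local stand-ins for stability markers (assumption: not Spark's real annotations).
    // RUNTIME retention is chosen here only so the label is visible via reflection.
    @Documented
    @Retention(RetentionPolicy.RUNTIME)
    public @interface Stable {}

    @Documented
    @Retention(RetentionPolicy.RUNTIME)
    public @interface Evolving {}

    // Hypothetical new API: labeled Evolving, so it may change between minor releases.
    @Evolving
    public interface HypotheticalV2Reader {
        String read(String path);
    }

    // Hypothetical settled API: labeled Stable, so it is frozen until the next major release.
    @Stable
    public interface HypotheticalStableApi {
        int version();
    }

    // Hypothetical API with no label at all.
    public interface HypotheticalUnlabeledApi {}

    // Returns the stability label attached to an API type, or "Unannotated" if none.
    public static String stabilityOf(Class<?> api) {
        if (api.isAnnotationPresent(Stable.class)) {
            return "Stable";
        }
        if (api.isAnnotationPresent(Evolving.class)) {
            return "Evolving";
        }
        return "Unannotated";
    }

    public static void main(String[] args) {
        System.out.println(stabilityOf(HypotheticalV2Reader.class));
        System.out.println(stabilityOf(HypotheticalStableApi.class));
        System.out.println(stabilityOf(HypotheticalUnlabeledApi.class));
    }
}
```

The point of the sketch is that the label travels with the API type itself,
so both the generated documentation and any tooling can tell users which
interfaces are still allowed to change.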


On Thu, Sep 6, 2018 at 9:49 AM Ryan Blue <rblue@netflix.com.invalid> wrote:

> It would be great to get more features out incrementally. For experimental
> features, do we have more relaxed constraints?
>
> On Thu, Sep 6, 2018 at 9:47 AM Reynold Xin <rxin@databricks.com> wrote:
>
>> +1 on 3.0
>>
>> Dsv2 stable can still evolve across major releases. DataFrame,
>> Dataset, dsv1 and a lot of other major features all were developed
>> throughout the 1.x and 2.x lines.
>>
>> I do want to explore ways for us to get dsv2 incremental changes out
>> there more frequently, to get feedback. Maybe that means we apply additive
>> changes to 2.4.x; maybe that means making another 2.5 release sooner. I
>> will start a separate thread about it.
>>
>>
>>
>> On Thu, Sep 6, 2018 at 9:31 AM Sean Owen <srowen@gmail.com> wrote:
>>
>>> I think this doesn't necessarily mean 3.0 is coming soon (thoughts on
>>> timing? 6 months?) but simply next. Do you mean you'd prefer that change to
>>> happen before 3.x? If it's a significant change, it seems reasonable for a
>>> major version bump rather than minor. Is the concern that tying it to 3.0
>>> means you have to take a major version update to get it?
>>>
>>> I generally support moving on to 3.x so we can also jettison a lot of
>>> older dependencies, code, fix some long standing issues, etc.
>>>
>>> (BTW Scala 2.12 support, mentioned in the OP, will go in for 2.4)
>>>
>>> On Thu, Sep 6, 2018 at 9:10 AM Ryan Blue <rblue@netflix.com.invalid>
>>> wrote:
>>>
>>>> My concern is that the v2 data source API is still evolving and not
>>>> very close to stable. I had hoped to have stabilized the API and behaviors
>>>> for a 3.0 release. But we could also wait on that for a 4.0 release,
>>>> depending on when we think that will be.
>>>>
>>>> Unless there is a pressing need to move to 3.0 for some other area, I
>>>> think it would be better for the v2 sources to have a 2.5 release.
>>>>
>>>> On Thu, Sep 6, 2018 at 8:59 AM Xiao Li <gatorsmile@gmail.com> wrote:
>>>>
>>>>> Yesterday, the 2.4 branch was created. Based on the above discussion,
>>>>> I think we can bump the master branch to 3.0.0-SNAPSHOT. Any concern?
>>>>>
>>>>>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
>
