spark-dev mailing list archives

From Ryan Blue <rb...@netflix.com.INVALID>
Subject Re: time for Apache Spark 3.0?
Date Thu, 06 Sep 2018 16:47:38 GMT
I definitely support moving to 3.0 to remove deprecations and update
dependencies.

For the v2 work, we know that there will be major API changes and
standardization of behavior from the new logical plans going into the next
release. I think it is a safe bet that this isn’t going to be completely
done for the next release, so it will still be experimental or unstable for
3.0. I also expect that there will be some things that we want to
deprecate. Ideally, that deprecation could happen before a major release so
we can remove the deprecated APIs in that release.

I don’t have a problem releasing 3.0 with an unstable v2 API or targeting
4.0 to remove behavior and APIs replaced by v2. But I want to make sure we
consider it when deciding what the next release should be.

It is probably better to release 3.0 now because it isn’t clear when the v2
API will become stable. And if we choose to release 3.0 next, we should
*not* aim to stabilize v2 for that release. We should still try to make it
stable as soon as possible, but I think that is unlikely to happen in time,
and we should not rush to claim it is stable.

rb

On Thu, Sep 6, 2018 at 9:31 AM Sean Owen <srowen@gmail.com> wrote:

> I think this doesn't necessarily mean 3.0 is coming soon (thoughts on
> timing? 6 months?) but simply next. Do you mean you'd prefer that change to
> happen before 3.x? If it's a significant change, seems reasonable for a
> major version bump rather than minor. Is the concern that tying it to 3.0
> means you have to take a major version update to get it?
>
> I generally support moving on to 3.x so we can also jettison a lot of
> older dependencies, code, fix some long standing issues, etc.
>
> (BTW Scala 2.12 support, mentioned in the OP, will go in for 2.4)
>
> On Thu, Sep 6, 2018 at 9:10 AM Ryan Blue <rblue@netflix.com.invalid>
> wrote:
>
>> My concern is that the v2 data source API is still evolving and not very
>> close to stable. I had hoped to have stabilized the API and behaviors for a
>> 3.0 release. But we could also wait on that for a 4.0 release, depending on
>> when we think that will be.
>>
>> Unless there is a pressing need to move to 3.0 for some other area, I
>> think it would be better for the v2 sources to have a 2.5 release.
>>
>> On Thu, Sep 6, 2018 at 8:59 AM Xiao Li <gatorsmile@gmail.com> wrote:
>>
>>> Yesterday, the 2.4 branch was created. Based on the above discussion, I
>>> think we can bump the master branch to 3.0.0-SNAPSHOT. Any concern?
>>>
>>>

-- 
Ryan Blue
Software Engineer
Netflix
