spark-dev mailing list archives

From Dongjoon Hyun <dongjoon.h...@gmail.com>
Subject Re: [Proposal] Modification to Spark's Semantic Versioning Policy
Date Sat, 07 Mar 2020 19:30:39 GMT
+1 for Sean's concerns and questions.

Bests,
Dongjoon.

On Fri, Mar 6, 2020 at 3:14 PM Sean Owen <srowen@gmail.com> wrote:

> This thread established some good general principles, illustrated by a few
> good examples. It didn't draw specific conclusions about what to add back,
> which is why it wasn't at all controversial. What it means in specific
> cases is where there may be disagreement, and that harder question hasn't
> been addressed.
>
> The reverts I have seen so far seemed like obvious ones, but yes, there
> are several more going on now, some pretty broad. I am not even sure what
> all of them are. In addition to below,
> https://github.com/apache/spark/pull/27839. Would it be too much overhead
> to post to this thread any changes that one believes are endorsed by these
> principles, and perhaps by a stricter interpretation of them now? It's
> important enough that we should get any data points and input now. (We're
> obviously not going to debate each one.) A draft PR, or several, actually
> sounds like a good vehicle for that -- as long as people know about them!
>
> Also, is there any usage data available to share? Many arguments turn
> on 'commonly used', but can we know that more concretely?
>
> Otherwise I think we'll back into implementing personal interpretations of
> general principles, which is arguably the issue in the first place, even
> when everyone believes in good faith in the same principles.
>
>
>
> On Fri, Mar 6, 2020 at 1:08 PM Dongjoon Hyun <dongjoon.hyun@gmail.com>
> wrote:
>
>> Hi, All.
>>
>> Recently, reverting PRs seems to be spreading like the *well-known*
>> virus.
>> Can we finalize this policy first, before making unofficial personal
>> decisions? Technically, this thread was not a vote, and our website
>> doesn't have a clear policy yet.
>>
>> https://github.com/apache/spark/pull/27821
>> [SPARK-25908][SQL][FOLLOW-UP] Add Back Multiple Removed APIs
>>     ==> This technically reverts most of SPARK-25908.
>>
>> https://github.com/apache/spark/pull/27835
>> Revert "[SPARK-25457][SQL] IntegralDivide returns data type of the
>> operands"
>>
>> https://github.com/apache/spark/pull/27834
>> Revert [SPARK-24640][SQL] Return `NULL` from `size(NULL)` by default
>>
>> Bests,
>> Dongjoon.
>>
>> On Thu, Mar 5, 2020 at 9:08 PM Dongjoon Hyun <dongjoon.hyun@gmail.com>
>> wrote:
>>
>>> Hi, All.
>>>
>>> There is an on-going PR from Xiao referencing this email.
>>>
>>> https://github.com/apache/spark/pull/27821
>>>
>>> Bests,
>>> Dongjoon.
>>>
>>> On Fri, Feb 28, 2020 at 11:20 AM Sean Owen <srowen@gmail.com> wrote:
>>>
>>>> On Fri, Feb 28, 2020 at 12:03 PM Holden Karau <holden@pigscanfly.ca>
>>>> wrote:
>>     1. Could you estimate how many revert commits are required in
>>>> >> `branch-3.0` for the new rubric?
>>>>
>>>> Fair question about what actual change this implies for 3.0. So far it
>>>> seems like some targeted, quite reasonable reverts. I don't think
>>>> anyone's suggesting reverting loads of changes.
>>>>
>>>>
>>>> >>     2. Are you going to revert all removed test cases for the
>>>> deprecated ones?
>>>> > This is a good point; making sure we keep the tests as well is
>>>> > important (worse than removing a deprecated API is shipping it broken).
>>>>
>>>> (I'd say, yes of course! which seems consistent with what is happening
>>>> now)
>>>>
>>>>
>>>> >>     3. Does it make any delay for Apache Spark 3.0.0 release?
>>         (I believe it was previously scheduled for June, before Spark
>>>> >> Summit 2020)
>>>> >
>>>> > I think if we need to delay to make a better release, that is OK,
>>>> > especially given that our current preview releases are available to
>>>> > gather community feedback.
>>>>
>>>> Of course these things block 3.0 -- all the more reason to keep it
>>>> specific and targeted -- but nothing so far seems inconsistent with
>>>> finishing in a month or two.
>>>>
>>>>
>> Although there was a discussion already, I want to make sure of the
>>>> >> following tough parts.
>>>> >>     4. We are not going to add Scala 2.11 API, right?
>>>> > I hope not.
>>>> >>
>>>> >>     5. We are not going to support Python 2.x in Apache Spark 3.1+,
>>>> right?
>>>> > I think doing that would be bad; it has already reached end of life
>>>> > elsewhere.
>>>>
>>>> Yeah this is an important subtext -- the valuable principles here
>>>> could be interpreted in many different ways, depending on how much you
>>>> weigh the value of keeping APIs for compatibility vs. the value of
>>>> simplifying Spark and pushing users to newer APIs more forcibly. They're all
>>>> judgment calls, based on necessarily limited data about the universe
>>>> of users. We can only go on rare direct user feedback, on feedback
>>>> perhaps from vendors as proxies for a subset of users, and the general
>>>> good faith judgment of committers who have lived Spark for years.
>>>>
>>>> My specific interpretation is that the standard is (correctly)
>>>> tightening going forward, and retroactively a bit for 3.0. But, I do
>>>> not think anyone is advocating for the logical extreme of, for
>>>> example, maintaining Scala 2.11 compatibility indefinitely. I think
>>>> that falls out readily from the rubric here: maintaining 2.11
>>>> compatibility is really quite painful if you ever support 2.13 too,
>>>> for example.
>>>>
>>>
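For readers without the JIRA context, one of the reverts discussed above, SPARK-24640 ("Return `NULL` from `size(NULL)` by default"), concerns a concrete behavior change. A minimal plain-Python sketch of the two behaviors at stake (the helper names here are hypothetical; in Spark itself the choice is governed by the `spark.sql.legacy.sizeOfNull` configuration):

```python
def size_legacy(collection):
    # Older Hive-compatible behavior: size(NULL) returns -1.
    return -1 if collection is None else len(collection)

def size_new(collection):
    # Behavior introduced by SPARK-24640: size(NULL) returns NULL
    # (modeled as None here).
    return None if collection is None else len(collection)

print(size_legacy(None))      # -1
print(size_new(None))         # None
print(size_legacy([1, 2, 3])) # 3
```

The revert debated in https://github.com/apache/spark/pull/27834 is about which of these two defaults Spark 3.0 should ship with.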
