spark-dev mailing list archives

From Michael Armbrust <mich...@databricks.com>
Subject Re: Spark 2.4.2
Date Wed, 17 Apr 2019 00:03:44 GMT
Thanks Ryan. To me the "test" for putting things in a maintenance release
is really a trade-off between benefit and risk (along with some caveats,
like user-facing surface should not grow). The benefits here are fairly
large (now it is possible to plug in partition aware data sources) and the
risk is very low (no change in behavior by default).

> And bugs aren't usually fixed with a configuration flag to turn on the fix.


Agree, this should be on by default in master. That would just tip the risk
balance for me in a maintenance release.

On Tue, Apr 16, 2019 at 4:55 PM Ryan Blue <rblue@netflix.com> wrote:

> Spark has a lot of strange behaviors already that we don't fix in patch
> releases. And bugs aren't usually fixed with a configuration flag to turn
> on the fix.
>
> That said, I don't have a problem with this commit making it into a patch
> release. This is a small change and looks safe enough to me. I was just a
> little surprised since I was expecting a correctness issue if this is
> prompting a release. I'm definitely on the side of case-by-case judgments
> on what to allow in patch releases and this looks fine.
>
> On Tue, Apr 16, 2019 at 4:27 PM Michael Armbrust <michael@databricks.com>
> wrote:
>
>> I would argue that it's confusing enough to a user for options from
>> DataFrameWriter to be silently dropped when instantiating the data source
>> to consider this a bug.  They asked for partitioning to occur, and we are
>> doing nothing (not even telling them we can't).  I was certainly surprised
>> by this behavior.  Do you have a different proposal about how this should
>> be handled?
>>
>> On Tue, Apr 16, 2019 at 4:23 PM Ryan Blue <rblue@netflix.com> wrote:
>>
>>> Is this a bug fix? It looks like a new feature to me.
>>>
>>> On Tue, Apr 16, 2019 at 4:13 PM Michael Armbrust <michael@databricks.com>
>>> wrote:
>>>
>>>> Hello All,
>>>>
>>>> I know we just released Spark 2.4.1, but in light of fixing SPARK-27453
>>>> <https://issues.apache.org/jira/browse/SPARK-27453> I was wondering if
>>>> it might make sense to follow up quickly with 2.4.2.  Without this fix it's
>>>> very hard to build a datasource that correctly handles partitioning without
>>>> using unstable APIs.  There are also a few other fixes that have trickled
>>>> in since 2.4.1.
>>>>
>>>> If there are no objections, I'd like to start the process shortly.
>>>>
>>>> Michael
>>>>
>>>
>>>
>>> --
>>> Ryan Blue
>>> Software Engineer
>>> Netflix
>>>
>>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
>
