spark-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ryan Blue <rb...@netflix.com.INVALID>
Subject Re: Spark 2.4.2
Date Wed, 17 Apr 2019 15:54:29 GMT
Xiao, you're okay with deciding on inclusion in a patch release
case-by-case? In the past, you've vehemently advocated that absolutely no
patches that are not strictly bug fixes should go in. I think it is a good
step to accept this change, but I want to make sure you're conscious that
this clearly sets a new precedent.

On Tue, Apr 16, 2019 at 5:14 PM Xiao Li <lixiao@databricks.com> wrote:

> We also found two regressions that were introduced in the maintenance
> release 2.4.1. Below are the fixes that were recently merged:
> - https://github.com/apache/spark/pull/24359
> - https://github.com/apache/spark/pull/24329
>
>
> On Tue, Apr 16, 2019 at 5:04 PM Michael Armbrust <michael@databricks.com>
> wrote:
>
>> Thanks Ryan. To me the "test" for putting things in a maintenance release
>> is really a trade-off between benefit and risk (along with some caveats,
>> like user facing surface should not grow). The benefits here are fairly
>> large (now it is possible to plug in partition aware data sources) and the
>> risk is very low (no change in behavior by default).
>>
>> And bugs aren't usually fixed with a configuration flag to turn on the
>>> fix.
>>
>>
>> Agree, this should be on by default in master. That would just tip the
>> risk balance for me in a maintenance release.
>>
>> On Tue, Apr 16, 2019 at 4:55 PM Ryan Blue <rblue@netflix.com> wrote:
>>
>>> Spark has a lot of strange behaviors already that we don't fix in patch
>>> releases. And bugs aren't usually fixed with a configuration flag to turn
>>> on the fix.
>>>
>>> That said, I don't have a problem with this commit making it into a
>>> patch release. This is a small change and looks safe enough to me. I was
>>> just a little surprised since I was expecting a correctness issue if this
>>> is prompting a release. I'm definitely on the side of case-by-case
>>> judgments on what to allow in patch releases and this looks fine.
>>>
>>> On Tue, Apr 16, 2019 at 4:27 PM Michael Armbrust <michael@databricks.com>
>>> wrote:
>>>
>>>> I would argue that its confusing enough to a user for options from
>>>> DataFrameWriter to be silently dropped when instantiating the data source
>>>> to consider this a bug.  They asked for partitioning to occur, and we are
>>>> doing nothing (not even telling them we can't).  I was certainly surprised
>>>> by this behavior.  Do you have a different proposal about how this should
>>>> be handled?
>>>>
>>>> On Tue, Apr 16, 2019 at 4:23 PM Ryan Blue <rblue@netflix.com> wrote:
>>>>
>>>>> Is this a bug fix? It looks like a new feature to me.
>>>>>
>>>>> On Tue, Apr 16, 2019 at 4:13 PM Michael Armbrust <
>>>>> michael@databricks.com> wrote:
>>>>>
>>>>>> Hello All,
>>>>>>
>>>>>> I know we just released Spark 2.4.1, but in light of fixing
>>>>>> SPARK-27453 <https://issues.apache.org/jira/browse/SPARK-27453>
I
>>>>>> was wondering if it might make sense to follow up quickly with 2.4.2.
>>>>>> Without this fix its very hard to build a datasource that correctly
handles
>>>>>> partitioning without using unstable APIs.  There are also a few other
fixes
>>>>>> that have trickled in since 2.4.1.
>>>>>>
>>>>>> If there are no objections, I'd like to start the process shortly.
>>>>>>
>>>>>> Michael
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Ryan Blue
>>>>> Software Engineer
>>>>> Netflix
>>>>>
>>>>
>>>
>>> --
>>> Ryan Blue
>>> Software Engineer
>>> Netflix
>>>
>>
>
> --
> [image:
> https://databricks.com/sparkaisummit/north-america?utm_source=email&utm_medium=signature]
>


-- 
Ryan Blue
Software Engineer
Netflix

Mime
View raw message