spark-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ryan Blue <rb...@netflix.com.INVALID>
Subject Re: Spark 2.4.2
Date Wed, 17 Apr 2019 16:34:56 GMT
How is this a bug fix?

On Wed, Apr 17, 2019 at 9:30 AM Xiao Li <lixiao@databricks.com> wrote:

> Michael and I had an offline discussion about this PR
> https://github.com/apache/spark/pull/24365. He convinced me that this is
> a bug fix. The code changes of this bug fix are very tiny and the risk is
> very low. To avoid any behavior change in the patch releases, this PR also
> added a legacy flag whose default value is off.
>
>
>
>
> On Wed, Apr 17, 2019 at 8:55 AM Ryan Blue <rblue@netflix.com> wrote:
>
>> Xiao, you're okay with deciding on inclusion in a patch release
>> case-by-case? In the past, you've vehemently advocated that absolutely no
>> patches that are not strictly bug fixes should go in. I think it is a good
>> step to accept this change, but I want to make sure you're conscious that
>> this clearly sets a new precedent.
>>
>> On Tue, Apr 16, 2019 at 5:14 PM Xiao Li <lixiao@databricks.com> wrote:
>>
>>> We also found two regressions that were introduced in the maintenance
>>> release 2.4.1. Below are the fixes that were recently merged:
>>> - https://github.com/apache/spark/pull/24359
>>> - https://github.com/apache/spark/pull/24329
>>>
>>>
>>> On Tue, Apr 16, 2019 at 5:04 PM Michael Armbrust <michael@databricks.com>
>>> wrote:
>>>
>>>> Thanks Ryan. To me the "test" for putting things in a maintenance
>>>> release is really a trade-off between benefit and risk (along with some
>>>> caveats, like user facing surface should not grow). The benefits here are
>>>> fairly large (now it is possible to plug in partition aware data sources)
>>>> and the risk is very low (no change in behavior by default).
>>>>
>>>> And bugs aren't usually fixed with a configuration flag to turn on the
>>>>> fix.
>>>>
>>>>
>>>> Agree, this should be on by default in master. That would just tip the
>>>> risk balance for me in a maintenance release.
>>>>
>>>> On Tue, Apr 16, 2019 at 4:55 PM Ryan Blue <rblue@netflix.com> wrote:
>>>>
>>>>> Spark has a lot of strange behaviors already that we don't fix in
>>>>> patch releases. And bugs aren't usually fixed with a configuration flag
to
>>>>> turn on the fix.
>>>>>
>>>>> That said, I don't have a problem with this commit making it into a
>>>>> patch release. This is a small change and looks safe enough to me. I
was
>>>>> just a little surprised since I was expecting a correctness issue if
this
>>>>> is prompting a release. I'm definitely on the side of case-by-case
>>>>> judgments on what to allow in patch releases and this looks fine.
>>>>>
>>>>> On Tue, Apr 16, 2019 at 4:27 PM Michael Armbrust <
>>>>> michael@databricks.com> wrote:
>>>>>
>>>>>> I would argue that its confusing enough to a user for options from
>>>>>> DataFrameWriter to be silently dropped when instantiating the data
source
>>>>>> to consider this a bug.  They asked for partitioning to occur, and
we are
>>>>>> doing nothing (not even telling them we can't).  I was certainly
surprised
>>>>>> by this behavior.  Do you have a different proposal about how this
should
>>>>>> be handled?
>>>>>>
>>>>>> On Tue, Apr 16, 2019 at 4:23 PM Ryan Blue <rblue@netflix.com>
wrote:
>>>>>>
>>>>>>> Is this a bug fix? It looks like a new feature to me.
>>>>>>>
>>>>>>> On Tue, Apr 16, 2019 at 4:13 PM Michael Armbrust <
>>>>>>> michael@databricks.com> wrote:
>>>>>>>
>>>>>>>> Hello All,
>>>>>>>>
>>>>>>>> I know we just released Spark 2.4.1, but in light of fixing
>>>>>>>> SPARK-27453 <https://issues.apache.org/jira/browse/SPARK-27453>
I
>>>>>>>> was wondering if it might make sense to follow up quickly
with 2.4.2.
>>>>>>>> Without this fix its very hard to build a datasource that
correctly handles
>>>>>>>> partitioning without using unstable APIs.  There are also
a few other fixes
>>>>>>>> that have trickled in since 2.4.1.
>>>>>>>>
>>>>>>>> If there are no objections, I'd like to start the process
shortly.
>>>>>>>>
>>>>>>>> Michael
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Ryan Blue
>>>>>>> Software Engineer
>>>>>>> Netflix
>>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> Ryan Blue
>>>>> Software Engineer
>>>>> Netflix
>>>>>
>>>>
>>>
>>> --
>>> [image:
>>> https://databricks.com/sparkaisummit/north-america?utm_source=email&utm_medium=signature]
>>>
>>
>>
>> --
>> Ryan Blue
>> Software Engineer
>> Netflix
>>
>
>
> --
> [image:
> https://databricks.com/sparkaisummit/north-america?utm_source=email&utm_medium=signature]
>


-- 
Ryan Blue
Software Engineer
Netflix

Mime
View raw message