spark-dev mailing list archives

From Jungtaek Lim <kabh...@gmail.com>
Subject Re: [VOTE] Release Apache Spark 2.4.2
Date Tue, 30 Apr 2019 06:17:13 GMT
Ah! Sorry, Xiao, I should have checked the fix version of the issue (it's
2.4.3/3.0.0).

Then it looks much better to revert and avoid a dependency conflict in a
bugfix release. Jackson is known for making backward-incompatible changes in
non-major versions, so I agree it's something to be careful with, or to
shade/relocate and forget about.

On Tue, Apr 30, 2019 at 3:04 PM Xiao Li <lixiao@databricks.com> wrote:

> Jungtaek,
>
> Thanks for your input! Sorry for the confusion. Let me make it clear.
>
>    - All the previous 2.4.x [including 2.4.2] releases are using Jackson
>    2.6.7.1.
>    - In the master branch, Jackson has already been upgraded to 2.9.8.
>    - Here, I am just trying to revert the Jackson upgrade in the upcoming
>    2.4.3 release.
>
> Cheers,
>
> Xiao
>
> On Mon, Apr 29, 2019 at 10:53 PM Jungtaek Lim <kabhwan@gmail.com> wrote:
>
>> Just to be clear, is upgrading Jackson to 2.9.8 coupled with the Scala
>> version? And could you summarize an actual case broken by the upgrade, if
>> you have observed one? An actual case would help us weigh the
>> impact.
>>
>> Btw, my 2 cents: personally I would rather avoid upgrading dependencies
>> in a bugfix release unless it resolves major bugs, so reverting it only from
>> branch-2.4 sounds good to me. (I still think the Jackson upgrade is necessary
>> in the master branch: it avoids lots of CVEs whose impact we would otherwise
>> waste a huge amount of time identifying. Also, other libraries will start
>> depending on Jackson 2.9.x, which conflicts with Spark's Jackson dependency.)
>>
>> If there is consensus on reverting it, we may also need to announce that
>> using Spark 2.4.2 is discouraged; otherwise end users will suffer from the
>> Jackson version going back and forth.
>>
>> Thanks,
>> Jungtaek Lim (HeartSaVioR)
>>
>> On Tue, Apr 30, 2019 at 2:30 PM Xiao Li <lixiao@databricks.com> wrote:
>>
>>> Before cutting 2.4.3, I just submitted a PR
>>> https://github.com/apache/spark/pull/24493 for reverting the commit
>>> https://github.com/apache/spark/commit/6f394a20bf49f67b4d6329a1c25171c8024a2fae
>>> .
>>>
>>> In general, we need to be very cautious about the Jackson upgrade in the
>>> patch releases, especially when this upgrade could break the existing
>>> behaviors of the external packages or data sources, and generate different
>>> results after the upgrade. The external packages and data sources need to
>>> change their source code to keep the original behaviors. The upgrade
>>> requires more discussions before releasing it, I think.
>>>
>>> In the previous PR https://github.com/apache/spark/pull/22071, we
>>> turned off `spark.master.rest.enabled` by default and added the following
>>> claim in our security doc:
>>>
>>>> The Rest Submission Server and the MesosClusterDispatcher do not
>>>> support authentication.  You should ensure that all network access to the
>>>> REST API & MesosClusterDispatcher (port 6066 and 7077 respectively by
>>>> default) are restricted to hosts that are trusted to submit jobs.
>>>
>>>
>>> We need to understand whether this Jackson CVE applies to Spark. Before
>>> officially releasing the Jackson upgrade, we need more input from all of
>>> you. For now, I would suggest reverting this upgrade from the upcoming
>>> 2.4.3 release, which is for fixing the accidental default Scala version
>>> change in the pre-built artifacts.
>>>
>>> Xiao
>>>
>>> On Mon, Apr 29, 2019 at 8:51 PM Dongjoon Hyun <dongjoon.hyun@gmail.com>
>>> wrote:
>>>
>>>> Hi, All and Xiao (as a next release manager).
>>>>
>>>> In any case, can the release manager include the information about the
>>>> used release script as a part of VOTE email officially?
>>>>
>>>> That information will be very helpful to reproduce Spark build (in the
>>>> downstream environment)
>>>>
>>>> Currently, it's not clear which release script is used, because the
>>>> master branch also changes from time to time during multiple RCs.
>>>>
>>>> We can only guess a git hash based on the RC start time.
>>>>
>>>> Bests,
>>>> Dongjoon.
>>>>
>>>> On Mon, Apr 29, 2019 at 7:17 PM Wenchen Fan <cloud0fan@gmail.com>
>>>> wrote:
>>>>
>>>>> >  it could just be fixed in master rather than back-port and re-roll
>>>>> the RC
>>>>>
>>>>> I don't think the release script is part of the released product. That
>>>>> said, we can just fix the release script in branch 2.4 without creating a
>>>>> new RC. We can even create a new repo for the release script, like
>>>>> spark-website, to make it clearer.
>>>>>
>>>>> On Tue, Apr 30, 2019 at 7:22 AM Sean Owen <srowen@gmail.com> wrote:
>>>>>
>>>>>> I think this is a reasonable idea; I know @vanzin had suggested it
>>>>>> was simpler to use the latest in case a bug was found in the release script
>>>>>> and then it could just be fixed in master rather than back-port and re-roll
>>>>>> the RC. That said I think we did / had to already drop the ability to build
>>>>>> <= 2.3 from the master release script already.
>>>>>>
>>>>>> On Sun, Apr 28, 2019 at 9:25 PM Wenchen Fan <cloud0fan@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> >  ... by using the release script of Spark 2.4 branch
>>>>>>>
>>>>>>> Shall we keep it as a policy? Previously we used the release script
>>>>>>> from the master branch to do the release work for all Spark versions, now I
>>>>>>> feel it's simpler and less error-prone to let the release script only
>>>>>>> handle one branch. We don't keep many branches as active at the same time,
>>>>>>> so the maintenance overhead for the release script should be OK.
>>>>>>>
>>>
>>>
>>
>>
>> --
>> Name : Jungtaek Lim
>> Blog : http://medium.com/@heartsavior
>> Twitter : http://twitter.com/heartsavior
>> LinkedIn : http://www.linkedin.com/in/heartsavior
>>
>
>
>


-- 
Name : Jungtaek Lim
Blog : http://medium.com/@heartsavior
Twitter : http://twitter.com/heartsavior
LinkedIn : http://www.linkedin.com/in/heartsavior
