spark-dev mailing list archives

From Xiao Li <lix...@databricks.com>
Subject Re: [VOTE] Release Apache Spark 2.4.2
Date Tue, 30 Apr 2019 06:04:00 GMT
Jungtaek,

Thanks for your input! Sorry for the confusion; let me make it clear.

   - All the previous 2.4.x releases (including 2.4.2) use Jackson 2.6.7.1.
   - In the master branch, Jackson has already been upgraded to 2.9.8.
   - Here, I am just trying to revert the Jackson upgrade in the upcoming
   2.4.3 release.
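For anyone who wants to double-check which Jackson version a given Spark distribution actually bundles, the jar names under `$SPARK_HOME/jars` make it visible. A minimal sketch follows; the stand-in directory and empty jar file exist only so the snippet runs anywhere, and on a real installation you would point the `ls` at your own `$SPARK_HOME/jars` instead:

```shell
# On a real installation:  ls "$SPARK_HOME/jars" | grep jackson
# Demo below uses a stand-in directory carrying the 2.4.x jar name.
demo_home=$(mktemp -d)
mkdir -p "$demo_home/jars"
touch "$demo_home/jars/jackson-databind-2.6.7.1.jar"   # stand-in jar
ls "$demo_home/jars" | grep jackson    # prints jackson-databind-2.6.7.1.jar
```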

Cheers,

Xiao

On Mon, Apr 29, 2019 at 10:53 PM Jungtaek Lim <kabhwan@gmail.com> wrote:

> Just to be clear, is upgrading Jackson to 2.9.8 coupled with the Scala
> version? And could you summarize an actual broken case caused by the
> upgrade, if you have observed any? A concrete case would help us weigh the
> impact.
>
> Btw, my 2 cents: personally I would rather avoid upgrading dependencies in
> a bugfix release unless it resolves major bugs, so reverting it only from
> branch-2.4 sounds good to me. (I still think the Jackson upgrade is
> necessary in the master branch; without it, we will waste a huge amount of
> time identifying the impact of the many CVEs. Other libraries will also
> start coupling with Jackson 2.9.x, which conflicts with Spark's Jackson
> dependency.)
>
> If there is a consensus on reverting it, we may also need to announce that
> using Spark 2.4.2 is discouraged; otherwise end users will suffer from the
> Jackson version going back and forth.
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)
>
> On Tue, Apr 30, 2019 at 2:30 PM Xiao Li <lixiao@databricks.com> wrote:
>
>> Before cutting 2.4.3, I just submitted a PR
>> https://github.com/apache/spark/pull/24493 for reverting the commit
>> https://github.com/apache/spark/commit/6f394a20bf49f67b4d6329a1c25171c8024a2fae.
>>
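Mechanically, the revert is straightforward. The sketch below uses a throwaway toy repo standing in for branch-2.4 so it runs anywhere; in the real repo the equivalent command would be `git revert 6f394a20bf49f67b4d6329a1c25171c8024a2fae` on branch-2.4 (the hash from the link above), plus resolving any conflicts:

```shell
# Toy repo simulating the revert of a dependency-upgrade commit.
cd "$(mktemp -d)"
git init -q .
git config user.email demo@example.com
git config user.name demo
echo "jackson.version=2.6.7.1" > versions.properties
git add versions.properties && git commit -qm "pin Jackson 2.6.7.1"
echo "jackson.version=2.9.8" > versions.properties
git commit -qam "upgrade Jackson to 2.9.8"
git revert --no-edit HEAD > /dev/null     # undo only the upgrade commit
cat versions.properties                   # back to jackson.version=2.6.7.1
```

Master keeps the upgrade; only the release branch moves back.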
>> In general, we need to be very cautious about a Jackson upgrade in patch
>> releases, especially when the upgrade could break the existing behavior of
>> external packages or data sources and generate different results. Those
>> external packages and data sources would need to change their source code
>> to keep the original behavior. I think the upgrade requires more
>> discussion before we release it.
>>
>> In the previous PR https://github.com/apache/spark/pull/22071, we turned
>> off `spark.master.rest.enabled` by default and added the following
>> statement to our security doc:
>>
>>> The Rest Submission Server and the MesosClusterDispatcher do not support
>>> authentication.  You should ensure that all network access to the REST API
>>> & MesosClusterDispatcher (port 6066 and 7077 respectively by default) are
>>> restricted to hosts that are trusted to submit jobs.
>>
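As a practical reminder of what that change amounts to, here is a sketch. The config key `spark.master.rest.enabled` and the default ports 6066/7077 are as described above; the scratch `SPARK_HOME` fallback exists only so the snippet runs anywhere:

```shell
# spark.master.rest.enabled is off by default since the change in PR 22071;
# stating it explicitly in spark-defaults.conf makes the intent visible.
SPARK_HOME="${SPARK_HOME:-$(mktemp -d)}"
mkdir -p "$SPARK_HOME/conf"
echo "spark.master.rest.enabled false" >> "$SPARK_HOME/conf/spark-defaults.conf"
grep rest.enabled "$SPARK_HOME/conf/spark-defaults.conf"
# Network access to ports 6066/7077 should additionally be restricted to
# trusted hosts at the network level (firewall rules, not shown here).
```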
>>
>> We need to understand whether this Jackson CVE applies to Spark. Before
>> officially releasing the Jackson upgrade, we need more input from all of
>> you. For now, I would suggest reverting this upgrade from the upcoming
>> 2.4.3 release, which is meant to fix the accidental default Scala version
>> change in the pre-built artifacts.
>>
>> Xiao
>>
>> On Mon, Apr 29, 2019 at 8:51 PM Dongjoon Hyun <dongjoon.hyun@gmail.com>
>> wrote:
>>
>>> Hi, All and Xiao (as a next release manager).
>>>
>>> In any case, could the release manager officially include information
>>> about the release script that was used as part of the VOTE email?
>>>
>>> That information would be very helpful for reproducing the Spark build
>>> (in downstream environments).
>>>
>>> Currently, it's not clear which release script was used, because the
>>> master branch also changes from time to time during multiple RCs. We can
>>> only guess a githash based on the RC start time.
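One way a release manager could capture this is to record, at RC cut time, the last commit that touched the release tooling. The sketch below uses a toy repo so it runs anywhere; in the real Spark repo the tooling lives under `dev/create-release/`:

```shell
# Toy repo; in the real repo, run the final git log line at RC cut time.
cd "$(mktemp -d)"
git init -q .
git config user.email demo@example.com
git config user.name demo
mkdir -p dev/create-release
echo '# release tooling' > dev/create-release/release-build.sh
git add . && git commit -qm "release tooling"
git log -1 --format=%H -- dev/create-release/   # githash to quote in the VOTE email
```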
>>>
>>> Bests,
>>> Dongjoon.
>>>
>>> On Mon, Apr 29, 2019 at 7:17 PM Wenchen Fan <cloud0fan@gmail.com> wrote:
>>>
>>>> > it could just be fixed in master rather than back-port and re-roll
>>>> > the RC
>>>>
>>>> I don't think the release script is part of the released product. Given
>>>> that, we can just fix the release script in branch-2.4 without creating
>>>> a new RC. We could even create a separate repo for the release script,
>>>> like spark-website, to make this clearer.
>>>>
>>>> On Tue, Apr 30, 2019 at 7:22 AM Sean Owen <srowen@gmail.com> wrote:
>>>>
>>>>> I think this is a reasonable idea. I know @vanzin had suggested it was
>>>>> simpler to use the latest script, in case a bug was found in the
>>>>> release script: it could then just be fixed in master rather than
>>>>> back-ported and re-rolled into the RC. That said, I think we already
>>>>> had to drop the ability to build <= 2.3 from the master release script.
>>>>>
>>>>> On Sun, Apr 28, 2019 at 9:25 PM Wenchen Fan <cloud0fan@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> > ... by using the release script of the Spark 2.4 branch
>>>>>>
>>>>>> Shall we keep that as a policy? Previously we used the release script
>>>>>> from the master branch to do the release work for all Spark versions;
>>>>>> now I feel it's simpler and less error-prone to have the release
>>>>>> script handle only one branch. We don't keep many branches active at
>>>>>> the same time, so the maintenance overhead for the release script
>>>>>> should be OK.
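The proposed policy can be sketched with a toy repo: each release branch carries its own copy of the tooling, and cutting a release uses the copy on that branch rather than the one on master. Everything below (repo layout, script name, printed strings) is illustrative; in the real repo the tooling lives under `dev/create-release/`:

```shell
# Toy repo simulating per-branch release tooling.
cd "$(mktemp -d)"
git init -q .
git config user.email demo@example.com
git config user.name demo
mkdir -p dev/create-release
printf '#!/bin/sh\necho "master release script"\n' > dev/create-release/release.sh
chmod +x dev/create-release/release.sh
git add . && git commit -qm "master tooling"
git checkout -qb branch-2.4
printf '#!/bin/sh\necho "branch-2.4 release script"\n' > dev/create-release/release.sh
git commit -qam "branch-2.4 tooling"
# The RC is cut with the branch's own script, not master's:
./dev/create-release/release.sh   # prints "branch-2.4 release script"
```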
>>>>>>
>>
>
>
> --
> Name : Jungtaek Lim
> Blog : http://medium.com/@heartsavior
> Twitter : http://twitter.com/heartsavior
> LinkedIn : http://www.linkedin.com/in/heartsavior
>


