spark-dev mailing list archives

From Jungtaek Lim <kabh...@gmail.com>
Subject Re: [VOTE] Release Apache Spark 2.4.2
Date Tue, 30 Apr 2019 05:53:05 GMT
Just to be clear, is the Jackson upgrade to 2.9.8 coupled with the Scala
version? And could you summarize an actual case that broke because of the
upgrade, if you have observed any? A concrete case would help us weigh the
impact.

Btw, my 2 cents: personally I would rather avoid upgrading dependencies in
a bugfix release unless the upgrade resolves major bugs, so reverting it only
in branch-2.4 sounds good to me. (I still think the Jackson upgrade is
necessary in the master branch: it avoids lots of CVEs whose impact we would
otherwise waste a huge amount of time identifying. Other libraries will also
start depending on Jackson 2.9.x, which would conflict with Spark's Jackson
dependency.)

If there is consensus on reverting it, we may also need to announce that
using Spark 2.4.2 is discouraged; otherwise end users will suffer from the
Jackson version going back and forth.

Thanks,
Jungtaek Lim (HeartSaVioR)

On Tue, Apr 30, 2019 at 2:30 PM Xiao Li <lixiao@databricks.com> wrote:

> Before cutting 2.4.3, I just submitted a PR
> https://github.com/apache/spark/pull/24493 for reverting the commit
> https://github.com/apache/spark/commit/6f394a20bf49f67b4d6329a1c25171c8024a2fae
> .
>
> In general, we need to be very cautious about the Jackson upgrade in the
> patch releases, especially when this upgrade could break the existing
> behaviors of the external packages or data sources, and generate different
> results after the upgrade. The external packages and data sources need to
> change their source code to keep the original behaviors. The upgrade
> requires more discussions before releasing it, I think.
>
> In the previous PR https://github.com/apache/spark/pull/22071, we turned
> off `spark.master.rest.enabled` by default and added the following claim in
> our security doc:
>
>> The Rest Submission Server and the MesosClusterDispatcher do not support
>> authentication.  You should ensure that all network access to the REST API
>> & MesosClusterDispatcher (port 6066 and 7077 respectively by default) are
>> restricted to hosts that are trusted to submit jobs.
>
>
> We need to understand whether this Jackson CVE applies to Spark. Before
> officially releasing the Jackson upgrade, we need more input from all of
> you. For now, I would suggest reverting this upgrade in the upcoming
> 2.4.3 release, whose purpose is to fix the accidental change of the default
> Scala version in the pre-built artifacts.
>
> Xiao
>
> On Mon, Apr 29, 2019 at 8:51 PM Dongjoon Hyun <dongjoon.hyun@gmail.com>
> wrote:
>
>> Hi, All and Xiao (as the next release manager).
>>
>> In any case, can the release manager officially include information about
>> the release script that was used as part of the VOTE email?
>>
>> That information would be very helpful for reproducing the Spark build (in
>> a downstream environment).
>>
>> Currently, it's not clear which release script was used, because the
>> master branch also changes from time to time across multiple RCs.
>>
>> We can only guess at a git hash based on the RC start time.
>>
>> Bests,
>> Dongjoon.
>>
>> On Mon, Apr 29, 2019 at 7:17 PM Wenchen Fan <cloud0fan@gmail.com> wrote:
>>
>>> > it could just be fixed in master rather than back-port and re-roll the RC
>>>
>>> I don't think the release script is part of the released product. That
>>> said, we can just fix the release script in branch 2.4 without creating a
>>> new RC. We can even create a new repo for the release script, like
>>> spark-website, to make it clearer.
>>>
>>> On Tue, Apr 30, 2019 at 7:22 AM Sean Owen <srowen@gmail.com> wrote:
>>>
>>>> I think this is a reasonable idea; I know @vanzin had suggested it was
>>>> simpler to use the latest in case a bug was found in the release script and
>>>> then it could just be fixed in master rather than back-port and re-roll the
>>>> RC. That said, I think we already had to drop the ability to build <=
>>>> 2.3 from the master release script.
>>>>
>>>> On Sun, Apr 28, 2019 at 9:25 PM Wenchen Fan <cloud0fan@gmail.com>
>>>> wrote:
>>>>
>>>>> >  ... by using the release script of Spark 2.4 branch
>>>>>
>>>>> Shall we keep it as a policy? Previously we used the release script
>>>>> from the master branch to do the release work for all Spark versions,
>>>>> now I feel it's simpler and less error-prone to let the release script
>>>>> only handle one branch. We don't keep many branches as active at the
>>>>> same time, so the maintenance overhead for the release script should
>>>>> be OK.
>>>>>


-- 
Name : Jungtaek Lim
Blog : http://medium.com/@heartsavior
Twitter : http://twitter.com/heartsavior
LinkedIn : http://www.linkedin.com/in/heartsavior
