spark-dev mailing list archives

From "Reynold Xin" <r...@databricks.com>
Subject Re: [VOTE] Release Apache Spark 2.4.2
Date Tue, 30 Apr 2019 06:34:49 GMT
Echoing both of you ... it's a bit risky to bump dependency versions in a patch release, especially
for a super common library. (I wish we shaded Jackson).

Maybe the CVE is a sufficient reason to bump the dependency, ignoring the potential behavior
changes that might happen, but I'd like to see a bit more discussion there and have 2.4.3
focus on fixing the Scala version issue first.

On Mon, Apr 29, 2019 at 11:17 PM, Jungtaek Lim < kabhwan@gmail.com > wrote:

> 
> Ah! Sorry Xiao, I should have checked the fix version of the issue (it's
> 2.4.3/3.0.0).
> 
> 
> Then it looks much better to revert and avoid a dependency conflict in a
> bugfix release. Jackson is known for making backward-incompatible changes
> in non-major versions, so I agree it's something to be careful with, or to
> shade/relocate and forget about.
> 
> On Tue, Apr 30, 2019 at 3:04 PM Xiao Li < lixiao@databricks.com > wrote:
> 
> 
>> Jungtaek, 
>> 
>> 
>> Thanks for your input! Sorry for the confusion. Let me make it clear. 
>> 
>> 
>> * All the previous 2.4.x releases [including 2.4.2] use Jackson
>> 2.6.7.1. 
>> 
>> * In the master branch, Jackson has already been upgraded to 2.9.8.  
>> 
>> * Here, I am just trying to revert the Jackson upgrade in the upcoming
>> 2.4.3 release.
>> 
>> 
>> 
>> Cheers,
>> 
>> 
>> Xiao
>> 
>> On Mon, Apr 29, 2019 at 10:53 PM Jungtaek Lim < kabhwan@gmail.com > wrote:
>> 
>> 
>>> Just to be clear, is upgrading Jackson to 2.9.8 coupled with the Scala
>>> version? And could you summarize an actual case broken by the upgrade, if
>>> you have observed one? An actual case would help us weigh the impact.
>>> 
>>> 
>>> Btw, my 2 cents: personally I would rather avoid upgrading dependencies in
>>> a bugfix release unless it resolves major bugs, so reverting it only from
>>> branch-2.4 sounds good to me. (I still think the Jackson upgrade is
>>> necessary in the master branch; it avoids lots of CVEs whose impact we
>>> would otherwise waste a huge amount of time identifying. Other libs will
>>> also start depending on Jackson 2.9.x, which conflicts with Spark's
>>> Jackson dependency.)
>>> 
>>> 
>>> If there is a consensus on reverting that, we may also need to announce
>>> that using Spark 2.4.2 is discouraged; otherwise end users will suffer
>>> from the Jackson version going back and forth.
>>> 
>>> 
>>> Thanks,
>>> Jungtaek Lim (HeartSaVioR)
>>> 
>>> On Tue, Apr 30, 2019 at 2:30 PM Xiao Li < lixiao@databricks.com > wrote:
>>> 
>>> 
>>>> Before cutting 2.4.3, I just submitted a PR
>>>> ( https://github.com/apache/spark/pull/24493 ) for reverting the commit
>>>> https://github.com/apache/spark/commit/6f394a20bf49f67b4d6329a1c25171c8024a2fae
>>>> .
>>>> 
>>>> 
>>>> In general, we need to be very cautious about upgrading Jackson in
>>>> patch releases, especially when the upgrade could break the existing
>>>> behavior of external packages or data sources and generate different
>>>> results afterwards. Those external packages and data sources would need
>>>> to change their source code to keep the original behavior. I think the
>>>> upgrade requires more discussion before we release it.
>>>> 
>>>> 
>>>> In the previous PR ( https://github.com/apache/spark/pull/22071 ), we
>>>> turned off `spark.master.rest.enabled` by default and added the following
>>>> claim in our security doc:
>>>> 
>>>>> The Rest Submission Server and the MesosClusterDispatcher do not support
>>>>> authentication.  You should ensure that all network access to the REST API
>>>>> & MesosClusterDispatcher (port 6066 and 7077 respectively by default) are
>>>>> restricted to hosts that are trusted to submit jobs.
>>>> 
>>>> 
>>>> 
>>>> We need to understand whether this Jackson CVE applies to Spark. Before
>>>> officially releasing the Jackson upgrade, we need more input from all of
>>>> you. For now, I would suggest reverting this upgrade from the upcoming
>>>> 2.4.3 release, which is for fixing the accidental change of the default
>>>> Scala version in the pre-built artifacts. 
>>>> 
>>>> 
>>>> Xiao
>>>> 
>>>> On Mon, Apr 29, 2019 at 8:51 PM Dongjoon Hyun < dongjoon.hyun@gmail.com >
>>>> wrote:
>>>> 
>>>> 
>>>>> Hi, All and Xiao (as the next release manager).
>>>>> 
>>>>> 
>>>>> In any case, can the release manager officially include information about
>>>>> the release script used as part of the VOTE email?
>>>>> 
>>>>> 
>>>>> That information will be very helpful for reproducing the Spark build (in
>>>>> downstream environments).
>>>>> 
>>>>> 
>>>>> Currently, it's not clear which release script is used, because the
>>>>> master branch also changes from time to time during multiple RCs.
>>>>> 
>>>>> 
>>>>> We can only guess a git hash based on the RC start time.
>>>>> 
>>>>> 
>>>>> Bests,
>>>>> Dongjoon.
>>>>> 
>>>>> On Mon, Apr 29, 2019 at 7:17 PM Wenchen Fan < cloud0fan@gmail.com > wrote:
>>>>> 
>>>>> 
>>>>>> >  it could just be fixed in master rather than back-port and re-roll
>>>>>> the RC
>>>>>> 
>>>>>> 
>>>>>> I don't think the release script is part of the released product. That
>>>>>> said, we can just fix the release script in branch 2.4 without creating a
>>>>>> new RC. We can even create a new repo for the release script, like
>>>>>> spark-website, to make it clearer. 
>>>>>> 
>>>>>> On Tue, Apr 30, 2019 at 7:22 AM Sean Owen < srowen@gmail.com > wrote:
>>>>>> 
>>>>>> 
>>>>>>> I think this is a reasonable idea; I know @vanzin had suggested it was
>>>>>>> simpler to use the latest in case a bug was found in the release script
>>>>>>> and then it could just be fixed in master rather than back-port and
>>>>>>> re-roll the RC. That said, I think we already had to drop the ability
>>>>>>> to build <= 2.3 from the master release script.
>>>>>>> 
>>>>>>> On Sun, Apr 28, 2019 at 9:25 PM Wenchen Fan < cloud0fan@gmail.com >
>>>>>>> wrote:
>>>>>>> 
>>>>>>> 
>>>>>>>> >  ... by using the release script of Spark 2.4 branch
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Shall we keep it as a policy? Previously we used the release script from
>>>>>>>> the master branch to do the release work for all Spark versions, now I
>>>>>>>> feel it's simpler and less error-prone to let the release script only
>>>>>>>> handle one branch. We don't keep many branches as active at the same time,
>>>>>>>> so the maintenance overhead for the release script should be OK.
>>>>>>>> 
>>> --
>>> Name : Jungtaek Lim
>>> Blog : http://medium.com/@heartsavior
>>> Twitter : http://twitter.com/heartsavior
>>> LinkedIn : http://www.linkedin.com/in/heartsavior