spark-dev mailing list archives

From shane knapp <skn...@berkeley.edu>
Subject Re: Should python-2 be supported in Spark 3.0?
Date Sat, 01 Jun 2019 02:38:10 GMT
+1000  ;)

On Sat, Jun 1, 2019 at 6:53 AM Denny Lee <denny.g.lee@gmail.com> wrote:

> +1
>
> On Fri, May 31, 2019 at 17:58 Holden Karau <holden@pigscanfly.ca> wrote:
>
>> +1
>>
>> On Fri, May 31, 2019 at 5:41 PM Bryan Cutler <cutlerb@gmail.com> wrote:
>>
>>> +1 and the draft sounds good
>>>
>>> On Thu, May 30, 2019, 11:32 AM Xiangrui Meng <mengxr@gmail.com> wrote:
>>>
>>>> Here is the draft announcement:
>>>>
>>>> ===
>>>> Plan for dropping Python 2 support
>>>>
>>>> As many of you already know, the Python core development team and many
>>>> widely used Python packages like Pandas and NumPy will drop Python 2 support
>>>> in or before 2020/01/01. Apache Spark has supported both Python 2 and 3
>>>> since the Spark 1.4 release in 2015. However, maintaining Python 2/3
>>>> compatibility is an increasing burden, and it essentially limits the use of
>>>> Python 3 features in Spark. Given that the end of life (EOL) of Python 2 is
>>>> coming, we plan to eventually drop Python 2 support as well. The current
>>>> plan is as follows:
>>>>
>>>> * In the next major release in 2019, we will deprecate Python 2
>>>> support. PySpark users will see a deprecation warning if Python 2 is used.
>>>> We will publish a migration guide for PySpark users to migrate to Python 3.
>>>> * We will drop Python 2 support in a future release in 2020, after
>>>> Python 2 EOL on 2020/01/01. PySpark users will see an error if Python 2 is
>>>> used.
>>>> * For releases that support Python 2, e.g., Spark 2.4, their patch
>>>> releases will continue supporting Python 2. However, after Python 2 EOL,
>>>> we might not take patches that are specific to Python 2.
>>>> ===
>>>>
>>>> Sean helped make a pass. If it looks good, I'm going to upload it to
>>>> Spark website and announce it here. Let me know if you think we should do
>>>> a VOTE instead.
>>>>
>>>> On Thu, May 30, 2019 at 9:21 AM Xiangrui Meng <mengxr@gmail.com> wrote:
>>>>
>>>>> I created https://issues.apache.org/jira/browse/SPARK-27884 to track
>>>>> the work.
>>>>>
>>>>> On Thu, May 30, 2019 at 2:18 AM Felix Cheung <
>>>>> felixcheung_m@hotmail.com> wrote:
>>>>>
>>>>>> We don’t usually reference a future release on website
>>>>>>
>>>>>> > Spark website and state that Python 2 is deprecated in Spark 3.0
>>>>>>
>>>>>> I suspect people will then ask when is Spark 3.0 coming out then.
>>>>>> Might need to provide some clarity on that.
>>>>>>
>>>>>
>>>>> We can say the "next major release in 2019" instead of Spark 3.0.
>>>>> Spark 3.0 timeline certainly requires a new thread to discuss.
>>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>> ------------------------------
>>>>>> *From:* Reynold Xin <rxin@databricks.com>
>>>>>> *Sent:* Thursday, May 30, 2019 12:59:14 AM
>>>>>> *To:* shane knapp
>>>>>> *Cc:* Erik Erlandson; Mark Hamstra; Matei Zaharia; Sean Owen;
>>>>>> Wenchen Fan; Xiangrui Meng; dev; user
>>>>>> *Subject:* Re: Should python-2 be supported in Spark 3.0?
>>>>>>
>>>>>> +1 on Xiangrui’s plan.
>>>>>>
>>>>>> On Thu, May 30, 2019 at 7:55 AM shane knapp <sknapp@berkeley.edu>
>>>>>> wrote:
>>>>>>
>>>>>>>> I don't have a good sense of the overhead of continuing to support
>>>>>>>> Python 2; is it large enough to consider dropping it in Spark 3.0?
>>>>>>>>
>>>>>>> from the build/test side, it will actually be pretty easy to
>>>>>>> continue support for python2.7 for spark 2.x as the feature sets
>>>>>>> won't be expanding.
>>>>>>>
>>>>>>
>>>>>>> that being said, i will be cracking a bottle of champagne when i can
>>>>>>> delete all of the ansible and anaconda configs for python2.x. :)
>>>>>>>
>>>>>>
>>>>> On the development side, in a future release that drops Python 2
>>>>> support we can remove code that maintains python 2/3 compatibility and
>>>>> start using python 3 only features, which is also quite exciting.
>>>>>
>>>>>
>>>>>>
>>>>>>> shane
>>>>>>> --
>>>>>>> Shane Knapp
>>>>>>> UC Berkeley EECS Research / RISELab Staff Technical Lead
>>>>>>> https://rise.cs.berkeley.edu
>>>>>>>
>>>>>>
>>
>> --
>> Twitter: https://twitter.com/holdenkarau
>> Books (Learning Spark, High Performance Spark, etc.):
>> https://amzn.to/2MaRAG9
>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>
>

-- 
Shane Knapp
UC Berkeley EECS Research / RISELab Staff Technical Lead
https://rise.cs.berkeley.edu
