spark-dev mailing list archives

From Hyukjin Kwon <gurwls...@gmail.com>
Subject Re: [DISCUSS] Drop Python 2, 3.4 and 3.5
Date Tue, 14 Jul 2020 02:25:15 GMT
Thank you all. Python 2, 3.4 and 3.5 are dropped now in the master branch
at https://github.com/apache/spark/pull/28957
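
A minimal sketch of what this could mean for code that has to run against several Spark versions. It assumes the effective minimum becomes Python 3.6, which is inferred from the versions dropped rather than taken from the PR. `MIN_PYTHON` and `is_supported` are hypothetical names, not PySpark APIs.

```python
import sys

# Assumption: with Python 2, 3.4 and 3.5 dropped, 3.6 is the oldest version left.
MIN_PYTHON = (3, 6)


def is_supported(version_info=None, minimum=MIN_PYTHON):
    """Return True if the (major, minor) pair meets the assumed minimum."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor); tuples compare element by element.
    return tuple(version_info[:2]) >= tuple(minimum)


print("interpreter supported:", is_supported())
```

Comparing `(major, minor)` tuples avoids string comparisons, which would wrongly sort "3.10" before "3.6".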

On Fri, Jul 3, 2020 at 10:01 AM, Hyukjin Kwon <gurwls223@gmail.com> wrote:

> Thanks Dongjoon. That makes much more sense now!
>
> On Fri, Jul 3, 2020 at 12:11 AM, Dongjoon Hyun <dongjoon.hyun@gmail.com> wrote:
>
>> Thank you, Hyukjin.
>>
>> According to the Python community, Python 3.5 also reaches EOL on 2020-09-13
>> (only two months left).
>>
>> - https://www.python.org/downloads/
>>
>> So, targeting live Python versions at Apache Spark 3.1.0 (December 2020)
>> looks reasonable to me.
>>
>> For old Python versions, we still have Apache Spark 2.4 LTS and also
>> Apache Spark 3.0.x will work.
>>
>> Bests,
>> Dongjoon.
>>
>>
>> On Wed, Jul 1, 2020 at 10:50 PM Yuanjian Li <xyliyuanjian@gmail.com>
>> wrote:
>>
>>> +1, especially Python 2
>>>
>>> On Thu, Jul 2, 2020 at 10:20 AM, Holden Karau <holden@pigscanfly.ca> wrote:
>>>
>>>> I’m ok with us dropping Python 2, 3.4, and 3.5 in Spark 3.1 forward. It
>>>> will be exciting to get to use more recent Python features. The most recent
>>>> Ubuntu LTS ships with 3.7, and while the previous LTS ships with 3.5, if
>>>> folks really can’t upgrade there’s conda.
>>>>
>>>> Is there anyone with a large Python 3.5 fleet who can’t use conda?
>>>>
>>>> On Wed, Jul 1, 2020 at 7:15 PM Hyukjin Kwon <gurwls223@gmail.com>
>>>> wrote:
>>>>
>>>>> Yeah, sure. They will be dropped from Spark 3.1 onwards. I don't think we
>>>>> should make such changes in maintenance releases.
>>>>>
>>>>> On Thu, Jul 2, 2020 at 11:13 AM, Holden Karau <holden@pigscanfly.ca> wrote:
>>>>>
>>>>>> To be clear the plan is to drop them in Spark 3.1 onwards, yes?
>>>>>>
>>>>>> On Wed, Jul 1, 2020 at 7:11 PM Hyukjin Kwon <gurwls223@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I would like to discuss dropping deprecated Python versions 2, 3.4
>>>>>>> and 3.5 at https://github.com/apache/spark/pull/28957. I assume
>>>>>>> people support it in general
>>>>>>> but I am writing this to make sure everybody is happy.
>>>>>>>
>>>>>>> Fokko made a very good investigation on it, see
>>>>>>> https://github.com/apache/spark/pull/28957#issuecomment-652022449.
>>>>>>> Judging from the statistics, I think we're pretty safe to drop them.
>>>>>>> Also note that dropping Python 2 was actually declared at
>>>>>>> https://python3statement.org/
>>>>>>>
>>>>>>> Roughly speaking, there are several main advantages to dropping them:
>>>>>>>   1. It removes a bunch of hacks we added, around 700 lines in
>>>>>>> PySpark.
>>>>>>>   2. Based on my testing and investigation, PyPy2 has a critical bug
>>>>>>> that causes a flaky test,
>>>>>>> https://issues.apache.org/jira/browse/SPARK-28358
>>>>>>>   3. Users can use Python type hints with Pandas UDFs without
>>>>>>> thinking about Python version
>>>>>>>   4. Users can leverage a single, latest cloudpickle,
>>>>>>> https://github.com/apache/spark/pull/28950. With Python 3.8+ it can
>>>>>>> also leverage C pickle.
>>>>>>>   5. ...
>>>>>>>
>>>>>>> So it benefits both users and dev. WDYT guys?
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>> Twitter: https://twitter.com/holdenkarau
>>>>>> Books (Learning Spark, High Performance Spark, etc.):
>>>>>> https://amzn.to/2MaRAG9  <https://amzn.to/2MaRAG9>
>>>>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>>>>
>>>>
>>>
