spark-dev mailing list archives

From Felix Cheung <felixcheun...@hotmail.com>
Subject Re: Should python-2 be supported in Spark 3.0?
Date Sat, 01 Jun 2019 05:54:08 GMT
Very subtle, but someone might take

“We will drop Python 2 support in a future release in 2020”

to mean any release in 2020, or the first one, whereas the next statement indicates that patch
releases are not included in the above. It might help to reorder the items or clarify the wording.


________________________________
From: shane knapp <sknapp@berkeley.edu>
Sent: Friday, May 31, 2019 7:38:10 PM
To: Denny Lee
Cc: Holden Karau; Bryan Cutler; Erik Erlandson; Felix Cheung; Mark Hamstra; Matei Zaharia;
Reynold Xin; Sean Owen; Wenchen Fen; Xiangrui Meng; dev; user
Subject: Re: Should python-2 be supported in Spark 3.0?

+1000  ;)

On Sat, Jun 1, 2019 at 6:53 AM Denny Lee <denny.g.lee@gmail.com> wrote:
+1

On Fri, May 31, 2019 at 17:58 Holden Karau <holden@pigscanfly.ca> wrote:
+1

On Fri, May 31, 2019 at 5:41 PM Bryan Cutler <cutlerb@gmail.com> wrote:
+1 and the draft sounds good

On Thu, May 30, 2019, 11:32 AM Xiangrui Meng <mengxr@gmail.com> wrote:
Here is the draft announcement:

===
Plan for dropping Python 2 support

As many of you already know, the Python core development team and many widely used Python packages
like Pandas and NumPy will drop Python 2 support on or before 2020/01/01. Apache Spark has
supported both Python 2 and 3 since the Spark 1.4 release in 2015. However, maintaining Python
2/3 compatibility is an increasing burden, and it essentially limits the use of Python 3 features
in Spark. Given that the end of life (EOL) of Python 2 is coming, we plan to eventually drop Python
2 support as well. The current plan is as follows:

* In the next major release in 2019, we will deprecate Python 2 support. PySpark users will
see a deprecation warning if Python 2 is used. We will publish a migration guide for PySpark
users to migrate to Python 3.
* We will drop Python 2 support in a future release in 2020, after Python 2 EOL on 2020/01/01.
PySpark users will see an error if Python 2 is used.
* For releases that support Python 2, e.g., Spark 2.4, their patch releases will continue
supporting Python 2. However, after Python 2 EOL, we might not take patches that are specific
to Python 2.
===
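
To make the three stages of the draft concrete, here is a minimal sketch of a version gate that returns quietly on Python 3, warns on Python 2 while it is deprecated, and raises once support is dropped. The function name `check_python_version` and the `python2_dropped` flag are hypothetical and chosen only for illustration; this is not taken from the PySpark code base.

```python
import sys
import warnings


def check_python_version(version_info=None, python2_dropped=False):
    """Hypothetical helper illustrating the deprecate-then-drop plan.

    Returns "ok" on Python 3 or later. On Python 2 it emits a
    DeprecationWarning and returns "deprecated" while support is only
    deprecated, and raises RuntimeError once support has been dropped.
    """
    if version_info is None:
        version_info = sys.version_info
    major = version_info[0]

    if major >= 3:
        return "ok"

    if python2_dropped:
        # Stage 2 of the draft: users see an error if Python 2 is used.
        raise RuntimeError(
            "Python 2 is no longer supported; please migrate to Python 3."
        )

    # Stage 1 of the draft: users see a deprecation warning on Python 2.
    warnings.warn(
        "Support for Python 2 is deprecated and will be removed in a "
        "future release.",
        DeprecationWarning,
        stacklevel=2,
    )
    return "deprecated"
```

Passing an explicit tuple such as `(2, 7, 15)` as `version_info` exercises the Python 2 branches without needing a Python 2 interpreter.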

Sean helped make a pass. If it looks good, I'm going to upload it to the Spark website and announce
it here. Let me know if you think we should do a VOTE instead.

On Thu, May 30, 2019 at 9:21 AM Xiangrui Meng <mengxr@gmail.com> wrote:
I created https://issues.apache.org/jira/browse/SPARK-27884 to track the work.

On Thu, May 30, 2019 at 2:18 AM Felix Cheung <felixcheung_m@hotmail.com> wrote:
We don’t usually reference a future release on the website.

> Spark website and state that Python 2 is deprecated in Spark 3.0

I suspect people will then ask when Spark 3.0 is coming out. Might need to provide some
clarity on that.

We can say the "next major release in 2019" instead of Spark 3.0. Spark 3.0 timeline certainly
requires a new thread to discuss.



________________________________
From: Reynold Xin <rxin@databricks.com>
Sent: Thursday, May 30, 2019 12:59:14 AM
To: shane knapp
Cc: Erik Erlandson; Mark Hamstra; Matei Zaharia; Sean Owen; Wenchen Fen; Xiangrui Meng; dev;
user
Subject: Re: Should python-2 be supported in Spark 3.0?

+1 on Xiangrui’s plan.

On Thu, May 30, 2019 at 7:55 AM shane knapp <sknapp@berkeley.edu> wrote:
I don't have a good sense of the overhead of continuing to support
Python 2; is it large enough to consider dropping it in Spark 3.0?

from the build/test side, it will actually be pretty easy to continue support for python2.7
for spark 2.x as the feature sets won't be expanding.

that being said, i will be cracking a bottle of champagne when i can delete all of the ansible
and anaconda configs for python2.x.  :)

On the development side, in a future release that drops Python 2 support we can remove code
that maintains python 2/3 compatibility and start using python 3 only features, which is also
quite exciting.


shane
--
Shane Knapp
UC Berkeley EECS Research / RISELab Staff Technical Lead
https://rise.cs.berkeley.edu


--
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
YouTube Live Streams: https://www.youtube.com/user/holdenkarau


