spark-dev mailing list archives

From Josh Rosen <rosenvi...@gmail.com>
Subject Python 3 support for PySpark has been merged into master
Date Thu, 16 Apr 2015 23:46:26 GMT
Hi everyone,

We just merged Python 3 support for PySpark into Spark's master branch
(which will become Spark 1.4.0).  This means that PySpark now supports
Python 2.6+, PyPy 2.5+, and Python 3.4+.

To run with Python 3, download and build Spark from the master branch, then
set the PYSPARK_PYTHON environment variable to point to a Python 3.4
executable.  For example:

PYSPARK_PYTHON=python3.4 ./bin/pyspark
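
Before launching, it can be handy to confirm that the executable you plan
to put in PYSPARK_PYTHON falls in one of the supported CPython ranges
listed above.  The following is only a minimal sketch and is not part of
Spark: the helper names interpreter_version and is_supported are made up
for illustration, and the check looks only at the Python language version,
so it does not verify the PyPy 2.5+ requirement.

```python
import subprocess
import sys


def interpreter_version(executable):
    """Ask the given interpreter for its (major, minor) language version."""
    out = subprocess.check_output(
        [executable, "-c", "import sys; print('%d.%d' % sys.version_info[:2])"]
    )
    major, minor = out.decode("ascii").strip().split(".")
    return int(major), int(minor)


def is_supported(version):
    """True for Python 2.6+ or 3.4+, the ranges named in this message.

    Hypothetical helper: Python 3.0 through 3.3 are treated as unsupported
    because the announcement lists 3.4+ only.
    """
    major, minor = version
    if major == 2:
        return minor >= 6
    return (major, minor) >= (3, 4)
```

For instance, calling is_supported(interpreter_version("python3.4")) checks
the executable used in the command above, assuming python3.4 is on your
PATH.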


For more details on this feature, see the pull request and JIRA:

- https://github.com/apache/spark/pull/5173
- https://issues.apache.org/jira/browse/SPARK-4897

For Spark contributors, this change means that any open PySpark pull
requests are now likely to have merge conflicts.  Even if a pull request
has no merge conflicts, we should re-test it with Jenkins to check that it
works under Python 3.  When backporting Python patches, committers may
wish to run the PySpark unit tests locally to make sure that the change
still works correctly in older branches.  I can also help with
backports / fixing conflicts.

Thanks to Davies Liu, Shane Knapp, Thom Neale, Xiangrui Meng, and everyone
else who helped with this patch.

- Josh
