spark-dev mailing list archives

From Dong Joon Hyun <dh...@hortonworks.com>
Subject Re: spark pypy support?
Date Mon, 14 Aug 2017 21:06:19 GMT
Hi, Tom.

What version of PyPy do you use?
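For reference, a quick way to confirm which interpreter and version a script is actually running under (standard library only; `PYSPARK_PYTHON` is the environment variable PySpark consults when choosing the driver's interpreter):

```python
import platform
import sys

# Print the interpreter implementation ("PyPy" or "CPython") and its
# version, to confirm which interpreter PySpark actually picked up.
print(platform.python_implementation())
print(sys.version.split()[0])
```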

In the Jenkins environment, the `pypy` tests consistently pass, just like the Python 2.7 and Python 3.4 tests.

https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-2.7/3340/consoleFull

========================================================================
Running PySpark tests
========================================================================
Running PySpark tests. Output is in /home/jenkins/workspace/spark-master-test-sbt-hadoop-2.7/python/unit-tests.log
Will test against the following Python executables: ['python2.7', 'python3.4', 'pypy']
Will test the following Python modules: ['pyspark-core', 'pyspark-ml', 'pyspark-mllib', 'pyspark-sql', 'pyspark-streaming']
Starting test(python2.7): pyspark.mllib.tests
Starting test(pypy): pyspark.sql.tests
Starting test(pypy): pyspark.tests
Starting test(pypy): pyspark.streaming.tests
Finished test(pypy): pyspark.tests (181s)
…

Tests passed in 1130 seconds


Best,
Dongjoon.


From: Tom Graves <tgraves_cs@yahoo.com.INVALID>
Date: Monday, August 14, 2017 at 1:55 PM
To: "dev@spark.apache.org" <dev@spark.apache.org>
Subject: spark pypy support?

Does anyone know if PyPy works with Spark? I saw a JIRA saying it was supported back in Spark 1.2, but I'm getting an error when trying it, and I'm not sure whether it's something with my PyPy version or just something Spark doesn't support.


AttributeError: 'builtin-code' object has no attribute 'co_filename'
Traceback (most recent call last):
  File "<builtin>/app_main.py", line 75, in run_toplevel
  File "/homes/tgraves/mbe.py", line 40, in <module>
    count = sc.parallelize(range(1, n + 1), partitions).map(f).reduce(add)
  File "/home/gs/spark/latest/python/lib/pyspark.zip/pyspark/rdd.py", line 834, in reduce
    vals = self.mapPartitions(func).collect()
  File "/home/gs/spark/latest/python/lib/pyspark.zip/pyspark/rdd.py", line 808, in collect
    port = self.ctx._jvm.PythonRDD.collectAndServe(self._jrdd.rdd())
  File "/home/gs/spark/latest/python/lib/pyspark.zip/pyspark/rdd.py", line 2440, in _jrdd
    self._jrdd_deserializer, profiler)
  File "/home/gs/spark/latest/python/lib/pyspark.zip/pyspark/rdd.py", line 2373, in _wrap_function
    pickled_command, broadcast_vars, env, includes = _prepare_for_python_RDD(sc, command)
  File "/home/gs/spark/latest/python/lib/pyspark.zip/pyspark/rdd.py", line 2359, in _prepare_for_python_RDD
    pickled_command = ser.dumps(command)
  File "/home/gs/spark/latest/python/lib/pyspark.zip/pyspark/serializers.py", line 460, in dumps
    return cloudpickle.dumps(obj, 2)
  File "/home/gs/spark/latest/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 703, in dumps
    cp.dump(obj)
  File "/home/gs/spark/latest/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 160, in dump
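For context on the error itself: cloudpickle serializes a function by introspecting its code object, including attributes such as `co_filename`. The message above ("'builtin-code' object has no attribute 'co_filename'") suggests the function being pickled resolved to a PyPy interpreter-level code object that lacks those attributes. A minimal illustration of the attribute involved (plain standard Python, not Spark internals):

```python
def square(x):
    # A pure-Python function always carries a full code object.
    return x * x

# cloudpickle-style introspection: this attribute exists on regular
# function code objects, but not on PyPy's 'builtin-code' objects,
# which is what the AttributeError in the traceback reports.
print(hasattr(square.__code__, "co_filename"))  # prints True
```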

Thanks,
Tom