spark-user mailing list archives

From Hyukjin Kwon <gurwls...@gmail.com>
Subject Re: how to set up pyspark eclipse, pyDev, virtualenv? syntaxError: yield from walk(
Date Fri, 06 Apr 2018 01:09:47 GMT
FYI, there is a JIRA and a PR for virtualenv support in PySpark:

https://issues.apache.org/jira/browse/SPARK-13587
https://github.com/apache/spark/pull/13599


2018-04-06 7:48 GMT+08:00 Andy Davidson <Andy@santacruzintegration.com>:

> FYI
>
> http://www.learn4master.com/algorithms/pyspark-unit-test-set-up-sparkcontext
>
> From: Andrew Davidson <Andy@SantaCruzIntegration.com>
> Date: Wednesday, April 4, 2018 at 5:36 PM
> To: "user @spark" <user@spark.apache.org>
> Subject: how to set up pyspark eclipse, pyDev, virtualenv? syntaxError: yield from walk(
>
> I am having a heck of a time setting up my development environment. I used
> pip to install PySpark. I also downloaded Spark from Apache.
>
> My Eclipse PyDev interpreter is configured as a Python 3 virtualenv.
>
> I have a simple unit test that loads a small DataFrame. df.show()
> generates the following error:
>
>
> 2018-04-04 17:13:56 ERROR Executor:91 - Exception in task 0.0 in stage 0.0 (TID 0)
>
> org.apache.spark.SparkException:
> Error from python worker:
>   Traceback (most recent call last):
>     File "/Users/a/workSpace/pythonEnv/spark-2.3.0/lib/python3.6/site.py", line 67, in <module>
>       import os
>     File "/Users/a/workSpace/pythonEnv/spark-2.3.0/lib/python3.6/os.py", line 409
>       yield from walk(new_path, topdown, onerror, followlinks)
>                ^
>   SyntaxError: invalid syntax
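>
> Note that "yield from" only exists in Python 3.3+, so a SyntaxError on that
> line of python3.6/os.py strongly suggests the worker process is a Python 2
> interpreter picking up the virtualenv's Python 3 standard library. One
> common fix is to pin both the driver and the workers to the virtualenv's
> interpreter before the SparkContext starts. A minimal sketch, assuming the
> interpreter lives under the virtualenv path shown in the traceback:
>
> import os
>
> # Assumption: the virtualenv's interpreter, inferred from the traceback path.
> VENV_PYTHON = "/Users/a/workSpace/pythonEnv/spark-2.3.0/bin/python"
>
> # Both variables must be set before the SparkContext is created, so the
> # executors spawn the same Python 3 interpreter as the driver.
> os.environ["PYSPARK_PYTHON"] = VENV_PYTHON
> os.environ["PYSPARK_DRIVER_PYTHON"] = VENV_PYTHON
>
> from pyspark import SparkConf, SparkContext
> sc = SparkContext(conf=SparkConf().setMaster("local[2]").setAppName("fixEnv"))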
>
>
>
> My unittest class is derived from:
>
>
> import unittest
>
> from pyspark import SparkConf, SparkContext
> from pyspark.sql import SQLContext
>
> # Module-level registry of live SparkContexts, keyed by test-class name.
> sc_values = {}
>
>
> class PySparkTestCase(unittest.TestCase):
>
>     @classmethod
>     def setUpClass(cls):
>         # One local SparkContext shared by every test in the class.
>         conf = SparkConf().setMaster("local[2]") \
>             .setAppName(cls.__name__)
>         #    .set("spark.authenticate.secret", "111111")
>         cls.sparkContext = SparkContext(conf=conf)
>         sc_values[cls.__name__] = cls.sparkContext
>         cls.sqlContext = SQLContext(cls.sparkContext)
>         print("aedwip:", SparkContext)
>
>     @classmethod
>     def tearDownClass(cls):
>         print("....calling stop tearDownClass, the content of sc_values=",
>               sc_values)
>         sc_values.clear()
>         cls.sparkContext.stop()
>
>
> This looks similar to the PySparkTestCase class in
> https://github.com/apache/spark/blob/master/python/pyspark/tests.py
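>
> For what it's worth, a minimal test built on the class above might look
> like the sketch below (the subclass name, data, and column names are made
> up for illustration):
>
> class SimpleDataFrameTest(PySparkTestCase):
>
>     def test_show(self):
>         # Build a tiny DataFrame through the shared SQLContext and show it.
>         df = self.sqlContext.createDataFrame(
>             [(1, "a"), (2, "b")], ["id", "letter"])
>         df.show()
>         self.assertEqual(2, df.count())
>
> if __name__ == "__main__":
>     unittest.main()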
>
>
> Any suggestions would be greatly appreciated.
>
>
> Andy
>
>
> My downloaded version is spark-2.3.0-bin-hadoop2.7.
>
>
> My virtualenv version is:
>
> (spark-2.3.0) $ pip show pySpark
> Name: pyspark
> Version: 2.3.0
> Summary: Apache Spark Python API
> Home-page: https://github.com/apache/spark/tree/master/python
> Author: Spark Developers
> Author-email: dev@spark.apache.org
> License: http://www.apache.org/licenses/LICENSE-2.0
> Location: /Users/a/workSpace/pythonEnv/spark-2.3.0/lib/python3.6/site-packages
> Requires: py4j
> (spark-2.3.0) $
>
> (spark-2.3.0) $ python --version
> Python 3.6.1
> (spark-2.3.0) $
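>
> A quick sanity check for interpreter mismatches is to compare sys.version
> on the driver with what a task reports from the executor; a small sketch:
>
> import sys
> from pyspark import SparkConf, SparkContext
>
> sc = SparkContext(conf=SparkConf().setMaster("local[2]").setAppName("vcheck"))
>
> # The lambda executes in the worker process, so it reports the executor's
> # Python version rather than the driver's.
> print("driver:  ", sys.version)
> print("executor:", sc.parallelize([0]).map(lambda _: sys.version).first())
> sc.stop()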
>
>
>
