spark-user mailing list archives

From Artemis User <arte...@dtechspace.com>
Subject Re: spark-submit not running on macbook pro
Date Thu, 19 Aug 2021 13:08:23 GMT
Looks like PySpark can't launch a JVM in the backend.  How did you set 
up Java and Spark on your machine?  Some suggestions that may help solve 
your issue:

 1. Use OpenJDK instead of Apple's JDK, since Spark is developed and
    tested against OpenJDK, not Apple's.  You can use Homebrew to
    install OpenJDK.  (I don't see any reason why you would need
    Apple's JDK unless you are using the latest Mac; see my question
    below.)
 2. Download the Spark tarball directly from Spark's web site, deploy
    it, and run Spark's bundled examples from the command line to
    verify your environment before integrating with PyCharm.
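To make that concrete: before touching PyCharm at all, a quick check like 
the one below (plain Python, no Spark required) tells you whether a usable 
`java` binary is even visible.  The gateway ultimately runs 
`$JAVA_HOME/bin/java` (falling back to `java` on the PATH, via 
bin/spark-class), so the error in your trace is exactly what you get when 
neither resolves.  The Homebrew OpenJDK path below is just a typical 
location on an Intel Mac, not a guarantee -- check yours with 
`/usr/libexec/java_home -V`:

```python
import os
import shutil
import subprocess

def check_java(java_home=None):
    """Return the path to a usable `java` binary, or None.

    Diagnostic sketch: if this returns None, PySpark's
    "Java gateway process exited" error is expected, because the
    gateway has no JVM to fork.
    """
    if java_home:
        candidate = os.path.join(java_home, "bin", "java")
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    # Fall back to whatever `java` is first on the PATH, if any
    return shutil.which("java")

# Typical Homebrew OpenJDK location (an assumption -- adjust for
# your machine)
java = check_java("/usr/local/opt/openjdk@11")
if java is None:
    print("No JVM found -- install one, e.g. `brew install openjdk@11`")
else:
    # `java -version` writes its output to stderr by convention
    result = subprocess.run([java, "-version"],
                            capture_output=True, text=True)
    print(result.stderr.strip())
```

If this prints a version string, point JAVA_HOME at that JDK before 
creating the SparkSession; if it doesn't, fix the Java install first.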

My question to the group: has anyone had any luck running Spark (or 
other applications) on Apple's JDK, performance-wise?  Is that the 
build with native libraries for the M1 chipset?

-- ND


On 8/17/21 1:56 AM, karan alang wrote:
>
> Hello Experts,
>
> I'm trying to run spark-submit on my MacBook Pro (from the command 
> line or via PyCharm), and it fails with this error:
>
> Exception: Java gateway process exited before sending its port number
>
> I've tried setting environment variables in the program (based on 
> recommendations found online), but the problem remains.
>
> Any pointers on how to resolve this issue?
>
> # explicitly setting environment variables
> os.environ["JAVA_HOME"] = "/Library/Java/JavaVirtualMachines/applejdk-11.0.7.10.1.jdk/Contents/Home"
> os.environ["PYTHONPATH"] = "/usr/local/Cellar/apache-spark/3.1.2/libexec//python/lib/py4j-0.10.4-src.zip:/usr/local/Cellar/apache-spark/3.1.2/libexec//python/:"
> os.environ["PYSPARK_SUBMIT_ARGS"] = "--master local[2] pyspark-shell"
>
> Traceback (most recent call last):
>   File "<input>", line 1, in <module>
>   File "/Applications/PyCharm CE.app/Contents/plugins/python-ce/helpers/pydev/_pydev_bundle/pydev_umd.py", line 198, in runfile
>     pydev_imports.execfile(filename, global_vars, local_vars)  # execute the script
>   File "/Applications/PyCharm CE.app/Contents/plugins/python-ce/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
>     exec(compile(contents+"\n", file, 'exec'), glob, loc)
>   File "/Users/karanalang/Documents/Technology/StructuredStreamin_Udemy/Spark-Streaming-In-Python-master/00-HelloSparkSQL/HelloSparkSQL.py", line 12, in <module>
>     spark = SparkSession.builder.master("local[*]").getOrCreate()
>   File "/Users/karanalang/.conda/envs/PythonLeetcode/lib/python3.9/site-packages/pyspark/sql/session.py", line 228, in getOrCreate
>     sc = SparkContext.getOrCreate(sparkConf)
>   File "/Users/karanalang/.conda/envs/PythonLeetcode/lib/python3.9/site-packages/pyspark/context.py", line 384, in getOrCreate
>     SparkContext(conf=conf or SparkConf())
>   File "/Users/karanalang/.conda/envs/PythonLeetcode/lib/python3.9/site-packages/pyspark/context.py", line 144, in __init__
>     SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
>   File "/Users/karanalang/.conda/envs/PythonLeetcode/lib/python3.9/site-packages/pyspark/context.py", line 331, in _ensure_initialized
>     SparkContext._gateway = gateway or launch_gateway(conf)
>   File "/Users/karanalang/.conda/envs/PythonLeetcode/lib/python3.9/site-packages/pyspark/java_gateway.py", line 108, in launch_gateway
>     raise Exception("Java gateway process exited before sending its port number")
> Exception: Java gateway process exited before sending its port number

