spark-user mailing list archives

From peteranolaN <peternolaninsec...@gmail.com>
Subject 1.5.2 prebuilt for 2.4 spark-submit standalone Python scripts not running
Date Sun, 27 Dec 2015 00:27:10 GMT
Hi all, 

Question from a newbie here about your excellent Spark:  

I've just installed Spark 1.5.2, pre-built for Hadoop 2.4 and later.  I'm
trying to go through the introductory documentation using local[4] to begin
with.  In pyspark, I'm able to run examples such as the simple application
from the quick-start guide (source below), PROVIDED that I remove the sc
initialisation.  However, if I try to run any Python script using
spark-submit, I get the verbose error message shown below and no output.
I have not been able to fix this.

Any assistance would be very gratefully received.  

My machine runs Windows 10 Home, with 8 GB RAM on a 64-bit Intel Core i3-@
3.4 GHz.  I'm using Python 2.7.11 under Anaconda 2.4.1.

Source, from
http://spark.apache.org/docs/latest/quick-start.html#self-contained-applications
=

from pyspark import SparkContext
logFile = "README.md"  # Should be some file on your system
sc = SparkContext("local", "Simple App")
logData = sc.textFile(logFile).cache()
numAs = logData.filter(lambda s: 'a' in s).count()
numBs = logData.filter(lambda s: 'b' in s).count()
print("Lines with a: %i, lines with b: %i" % (numAs, numBs))

Error output =

Traceback (most recent call last):
  File "c:/Users/Peter/spark-1.5.2-bin-hadoop2.4/SimpleApp.py", line 3, in <module>
    sc = SparkContext("local", "Simple App")
  File "c:\Users\Peter\spark-1.5.2-bin-hadoop2.4\python\lib\pyspark.zip\pyspark\context.py", line 113, in __init__
  File "c:\Users\Peter\spark-1.5.2-bin-hadoop2.4\python\lib\pyspark.zip\pyspark\context.py", line 170, in _do_init
  File "c:\Users\Peter\spark-1.5.2-bin-hadoop2.4\python\lib\pyspark.zip\pyspark\context.py", line 224, in _initialize_context
  File "c:\Users\Peter\spark-1.5.2-bin-hadoop2.4\python\lib\py4j-0.8.2.1-src.zip\py4j\java_gateway.py", line 701, in __call__
  File "c:\Users\Peter\spark-1.5.2-bin-hadoop2.4\python\lib\py4j-0.8.2.1-src.zip\py4j\protocol.py", line 300, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.lang.NullPointerException
	at java.lang.ProcessBuilder.start(Unknown Source)
	at org.apache.hadoop.util.Shell.runCommand(Shell.java:445)
	at org.apache.hadoop.util.Shell.run(Shell.java:418)
	at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
	at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:873)
	at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:853)
	at org.apache.spark.util.Utils$.fetchFile(Utils.scala:381)
	at org.apache.spark.SparkContext.addFile(SparkContext.scala:1387)
	at org.apache.spark.SparkContext.addFile(SparkContext.scala:1341)
	at org.apache.spark.SparkContext$$anonfun$15.apply(SparkContext.scala:484)
	at org.apache.spark.SparkContext$$anonfun$15.apply(SparkContext.scala:484)
	at scala.collection.immutable.List.foreach(List.scala:318)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:484)
	at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
	at java.lang.reflect.Constructor.newInstance(Unknown Source)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
	at py4j.Gateway.invoke(Gateway.java:214)
	at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79)
	at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
	at py4j.GatewayConnection.run(GatewayConnection.java:207)
	at java.lang.Thread.run(Unknown Source)
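
A possible lead, offered as an assumption and not a confirmed diagnosis: the
Shell.runCommand and FileUtil.chmod frames in the trace show that the
NullPointerException is raised while Hadoop tries to start an external
process to set file permissions.  On Windows, Hadoop typically relies on a
helper binary, winutils.exe, under the bin folder of the directory named by
the HADOOP_HOME environment variable, and a missing helper is a commonly
reported cause of failures like this one.  The sketch below only checks
whether that file is present; the function name find_winutils is hypothetical
and is not part of Spark or Hadoop.

```python
import os


def find_winutils(env=None):
    """Return the path to winutils.exe under HADOOP_HOME, or None if absent.

    Assumption: Hadoop looks for the helper at <HADOOP_HOME>/bin/winutils.exe.
    """
    env = os.environ if env is None else env
    hadoop_home = env.get("HADOOP_HOME")
    if not hadoop_home:
        # HADOOP_HOME is not set at all.
        return None
    candidate = os.path.join(hadoop_home, "bin", "winutils.exe")
    return candidate if os.path.isfile(candidate) else None


if __name__ == "__main__":
    path = find_winutils()
    if path:
        print("winutils.exe found at %s" % path)
    else:
        print("HADOOP_HOME is unset or has no bin/winutils.exe")
```

If the check reports that the helper is missing, that would be consistent
with the trace above, but it does not by itself prove that this is the cause.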



