spark-user mailing list archives

From peteranolaN <>
Subject 1.5.2 prebuilt for 2.4 spark-submit standalone Python scripts not running
Date Sun, 27 Dec 2015 00:27:10 GMT
Hi all, 

Question from a newbie here about your excellent Spark:  

I've just installed Spark 1.5.2, pre-built for Hadoop 2.4 and later, and I'm
working through the introductory documentation using local[4] to begin with.
In the pyspark shell I can run examples such as the simple application (source
below), PROVIDED that I remove the sc initialisation.  However, if I try to
run any Python script using spark-submit, I get the verbose error output shown
below and no results.  I have not been able to fix this.
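For concreteness, the command I'm running looks like this (the script name
SimpleApp.py is illustrative; paths are from my machine):

```shell
# Run from the Spark install directory; SimpleApp.py stands in for any script
cd c:/Users/Peter/spark-1.5.2-bin-hadoop2.4
bin/spark-submit --master "local[4]" SimpleApp.py
# (from the Windows command prompt this would be bin\spark-submit.cmd)
```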

Any assistance would be very gratefully received.  

My machine runs Windows 10 Home, with 8 GB RAM, on a 64-bit Intel Core i3 @
3.4 GHz.  I'm using Python 2.7.11 under Anaconda 2.4.1.

Source:

from pyspark import SparkContext
logFile = ""  # Should be some file on your system
sc = SparkContext("local", "Simple App")
logData = sc.textFile(logFile).cache()
numAs = logData.filter(lambda s: 'a' in s).count()
numBs = logData.filter(lambda s: 'b' in s).count()
print("Lines with a: %i, lines with b: %i" % (numAs, numBs))
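For my own sanity-checking, the two counts are just line filters; here is the
same logic in plain Python without Spark (the sample lines are made up):

```python
# Plain-Python equivalent of the two filter/count steps; sample data is made up
lines = ["apple pie", "banana bread", "cherry"]
num_as = sum(1 for s in lines if 'a' in s)  # lines containing the letter 'a'
num_bs = sum(1 for s in lines if 'b' in s)  # lines containing the letter 'b'
print("Lines with a: %i, lines with b: %i" % (num_as, num_bs))
```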

Error output:

Traceback (most recent call last):
  File "c:/Users/Peter/spark-1.5.2-bin-hadoop2.4/", line 3, in
    sc = SparkContext("local", "Simple App")
  line 113, in __init__
  line 170, in _do_init
  line 224, in _initialize_context
  line 701, in __call__
  line 300, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling
: java.lang.NullPointerException
	at java.lang.ProcessBuilder.start(Unknown Source)
	at org.apache.hadoop.util.Shell.runCommand(
	at org.apache.hadoop.fs.FileUtil.chmod(
	at org.apache.hadoop.fs.FileUtil.chmod(
	at org.apache.spark.util.Utils$.fetchFile(Utils.scala:381)
	at org.apache.spark.SparkContext.addFile(SparkContext.scala:1387)
	at org.apache.spark.SparkContext.addFile(SparkContext.scala:1341)
	at org.apache.spark.SparkContext$$anonfun$15.apply(SparkContext.scala:484)
	at org.apache.spark.SparkContext$$anonfun$15.apply(SparkContext.scala:484)
	at scala.collection.immutable.List.foreach(List.scala:318)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:484)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
	at java.lang.reflect.Constructor.newInstance(Unknown Source)
	at py4j.reflection.MethodInvoker.invoke(
	at py4j.reflection.ReflectionEngine.invoke(
	at py4j.Gateway.invoke(
	at py4j.commands.ConstructorCommand.execute(
	at Source)
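From searching around, the ProcessBuilder.start / Shell.runCommand /
FileUtil.chmod frames look like Hadoop trying to shell out to its native
Windows helper, winutils.exe, which the prebuilt download does not include.
Is the fix something like the following sketch, run before launching
spark-submit?  (It assumes winutils.exe for Hadoop 2.4 has been placed in
C:\hadoop\bin — a hypothetical path.)

```python
import os

# Assumption: winutils.exe has been downloaded into C:\hadoop\bin (hypothetical path)
hadoop_home = r"C:\hadoop"
os.environ["HADOOP_HOME"] = hadoop_home
# Prepend the bin directory so Hadoop's Shell class can find winutils.exe
os.environ["PATH"] = os.path.join(hadoop_home, "bin") + os.pathsep + os.environ.get("PATH", "")
```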
