spark-user mailing list archives

From Matei Zaharia <>
Subject Re: Spark Quick Start - call to open needs explicit fs prefix
Date Mon, 24 Feb 2014 02:57:24 GMT
Good catch; the Spark cluster on EC2 is configured to use HDFS as its default filesystem, so
it can’t find this file. The quick start was written to run on a single machine with an
out-of-the-box install. If you’d like to upload this file to the HDFS cluster on EC2, use
the following command:

~/ephemeral-hdfs/bin/hadoop fs -put
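The file arguments of that command did not survive in the archive. A fuller sketch, assuming the standard spark-ec2 layout (with HDFS tools under `~/ephemeral-hdfs`) and using a hypothetical README.md as the file being uploaded, might look like:

```shell
# Sketch only: assumes a spark-ec2 deployment with ephemeral HDFS under
# ~/ephemeral-hdfs, and uses a hypothetical README.md as the uploaded file.
~/ephemeral-hdfs/bin/hadoop fs -put README.md /user/root/README.md

# Confirm the upload; the file is now visible to Spark without any prefix,
# since bare paths resolve against the cluster's default filesystem (HDFS).
~/ephemeral-hdfs/bin/hadoop fs -ls /user/root/README.md
```

After the upload, `sc.textFile("README.md")` on the cluster resolves against HDFS and finds the file.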


On Feb 23, 2014, at 6:33 PM, nicholas.chammas <> wrote:

> I just deployed Spark 0.9.0 to EC2 using the guide here. I then turned to the Quick Start
guide here and walked through it using the Python shell.
> When I do this:
> >>> textFile = sc.textFile("")
> >>> textFile.count()
> I get a long error output right after the count() that includes this:
> org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: hdfs://
> So I guess Spark assumed that the file was in HDFS. 
> To get the file open and count to work, I had to do this:
> >>> textFile = sc.textFile("file:///root/spark/")
> >>> textFile.count()
> I get the same results if I use the Scala shell.
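What the walkthrough above shows follows from how Hadoop-style path URIs are resolved: a path with an explicit scheme (file://, hdfs://) names its filesystem directly, while a bare path falls back to the cluster's configured default filesystem, which spark-ec2 sets to HDFS. A minimal sketch of that resolution rule (resolve_scheme is an illustrative helper, not a Spark or Hadoop API, and the file names are hypothetical):

```python
from urllib.parse import urlparse

def resolve_scheme(path, default_scheme="hdfs"):
    """Return the filesystem scheme a Hadoop-style resolver would use.

    A path with an explicit scheme (file://, hdfs://, s3n://, ...) keeps it;
    a bare path falls back to the default filesystem's scheme. This mirrors
    the resolution behavior only; it is not a real Spark or Hadoop call.
    """
    scheme = urlparse(path).scheme
    return scheme if scheme else default_scheme

# A bare path resolves against the default filesystem (HDFS on spark-ec2):
print(resolve_scheme("README.md"))                     # hdfs
# An explicit file:// prefix forces the local filesystem:
print(resolve_scheme("file:///root/spark/README.md"))  # file
```

This is why the same unprefixed call that works in a local out-of-the-box install fails on the EC2 cluster: the path is the same, but the default filesystem it resolves against is not.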
> Does the quick start guide need to be updated, or did I miss something?
> Nick
