Data may be spilled off to disk, hence HDFS is close to a necessity for Spark.
You can run Spark on a single machine without HDFS, but in distributed mode HDFS (or another shared filesystem) will be required.
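Note that when your Hadoop config sets the default filesystem to HDFS (e.g. fs.default.name = hdfs://master:9000 in core-site.xml), a bare path like "README.md" is resolved against HDFS, not the local disk. A minimal sketch from the Spark shell for reading a local file explicitly (the path here is just an example):

   // Prefix the path with file:// so it is resolved against the local
   // filesystem instead of the configured default (e.g. hdfs://master:9000).
   val localFile = sc.textFile("file:///home/you/spark/README.md")
   localFile.count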

Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi



On Wed, Mar 19, 2014 at 4:10 AM, Sai Prasanna <ansaiprasanna@gmail.com> wrote:
Mayur,

While reading a local file (one that is not in HDFS) through the Spark shell, does HDFS need to be up and running?


On Tue, Mar 18, 2014 at 9:46 PM, Mayur Rustagi <mayur.rustagi@gmail.com> wrote:
Your HDFS is down; you probably forgot to format the namenode.

Check whether the namenode process is running:
   ps -aef | grep -i namenode
If it is not running, and the data in HDFS is not critical, reformat it:
   hadoop namenode -format
then restart HDFS (with Hadoop 1.x, typically bin/start-dfs.sh).
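Once HDFS is back up, you can sanity-check the connection from the Spark shell before rerunning your job. A minimal sketch, assuming the master:9000 address from your config and a hypothetical file path in HDFS:

   // Host and port come from your core-site.xml; the path is hypothetical.
   val hdfsFile = sc.textFile("hdfs://master:9000/user/hadoop/README.md")
   hdfsFile.count  // should complete without a ConnectException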




On Tue, Mar 18, 2014 at 5:59 AM, Sai Prasanna <ansaiprasanna@gmail.com> wrote:
Hi all!

In the interactive Spark shell I get the following error. I just followed the steps in the video "First Steps with Spark - Spark Screencast #1" by Andy Konwinski.

Any thoughts?

scala> val textfile = sc.textFile("README.md")
textfile: org.apache.spark.rdd.RDD[String] = MappedRDD[1] at textFile at <console>:12

scala> textfile.count
java.lang.RuntimeException: java.net.ConnectException: Call to master/192.168.1.11:9000 failed on connection exception: java.net.ConnectException: Connection refused
at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:546)
at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:318)
at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:291)
at org.apache.spark.SparkContext$$anonfun$13.apply(SparkContext.scala:439)
at org.apache.spark.SparkContext$$anonfun$13.apply(SparkContext.scala:439)
at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$1.apply(HadoopRDD.scala:112)
at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$1.apply(HadoopRDD.scala:112)
at scala.Option.map(Option.scala:133)
at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:112)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:134)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:201)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:199)
at scala.Option.getOrElse(Option.scala:108)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:199)
at org.apache.spark.rdd.MappedRDD.getPartitions(MappedRDD.scala:26)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:201)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:199)
at scala.Option.getOrElse(Option.scala:108)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:199)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:886)
at org.apache.spark.rdd.RDD.count(RDD.scala:698)
at <init>(<console>:15)
at <init>(<console>:20)
at <init>(<console>:22)
at <init>(<console>:24)
at <init>(<console>:26)
at .<init>(<console>:30)
at .<clinit>(<console>)
at .<init>(<console>:11)
at .<clinit>(<console>)
at $export(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:629)
at org.apache.spark.repl.SparkIMain$Request$$anonfun$10.apply(SparkIMain.scala:897)
at scala.tools.nsc.interpreter.Line$$anonfun$1.apply$mcV$sp(Line.scala:43)
at scala.tools.nsc.io.package$$anon$2.run(package.scala:25)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.net.ConnectException: Call to master/192.168.1.11:9000 failed on connection exception: java.net.ConnectException: Connection refused
at org.apache.hadoop.ipc.Client.wrapException(Client.java:1099)
at org.apache.hadoop.ipc.Client.call(Client.java:1075)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
at com.sun.proxy.$Proxy8.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:542)
... 39 more
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1206)
at org.apache.hadoop.ipc.Client.call(Client.java:1050)
... 53 more


--
Sai Prasanna. AN
II M.Tech (CS), SSSIHL

Entire water in the ocean can never sink a ship, Unless it gets inside.
All the pressures of life can never hurt you, Unless you let them in.




