spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Saurabh Pateriya <saurabh.pater...@happiestminds.com>
Subject Help realted with spark streaming usage error
Date Tue, 09 Dec 2014 13:44:48 GMT
Hi ,

I am running spark streaming in standalone mode on twitter source into single machine(using
HDP virtual box)  I receive status from streaming context and I can print the same but when
I try to save those statuses as RDD into Hadoop using rdd.saveAsTextFiles or saveAsHadoopFiles("hdfs://10.20.32.204:50070/user/hue/test","txt")
I get below connection error.
My Hadoop version:2.4.0.2.1.1.0-385
Spark 1.1.0

ERROR-----------------------------------------------------

14/12/09 04:45:12 ERROR scheduler.JobScheduler: Error running job streaming job 1418129110000
ms.1
java.io.IOException: Call to /10.20.32.204:50070 failed on local exception: java.io.EOFException
        at org.apache.hadoop.ipc.Client.wrapException(Client.java:1107)
        at org.apache.hadoop.ipc.Client.call(Client.java:1075)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
        at com.sun.proxy.$Proxy9.getProtocolVersion(Unknown Source)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
        at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
        at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
        at org.apache.hadoop.mapred.SparkHadoopWriter$.createPathFromString(SparkHadoopWriter.scala:193)
        at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:685)
        at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:572)
        at org.apache.spark.rdd.RDD.saveAsTextFile(RDD.scala:894)
        at org.apache.spark.streaming.dstream.DStream$$anonfun$8.apply(DStream.scala:762)
        at org.apache.spark.streaming.dstream.DStream$$anonfun$8.apply(DStream.scala:760)
        at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:41)
        at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
        at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
        at scala.util.Try$.apply(Try.scala:161)
        at org.apache.spark.streaming.scheduler.Job.run(Job.scala:32)
        at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:155)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
Caused by: java.io.EOFException
        at java.io.DataInputStream.readInt(DataInputStream.java:392)
        at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:811)
        at org.apache.hadoop.ipc.Client$Connection.run(Client.java:749)
[error] (run-main-0) java.io.IOException: Call to /10.20.32.204:50070 failed on local exception:
java.io.EOFException
java.io.IOException: Call to /10.20.32.204:50070 failed on local exception: java.io.EOFException
        at org.apache.hadoop.ipc.Client.wrapException(Client.java:1107)
        at org.apache.hadoop.ipc.Client.call(Client.java:1075)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
        at com.sun.proxy.$Proxy9.getProtocolVersion(Unknown Source)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
        at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
        at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
        at org.apache.hadoop.mapred.SparkHadoopWriter$.createPathFromString(SparkHadoopWriter.scala:193)
        at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:685)
        at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:572)
        at org.apache.spark.rdd.RDD.saveAsTextFile(RDD.scala:894)
        at org.apache.spark.streaming.dstream.DStream$$anonfun$8.apply(DStream.scala:762)
        at org.apache.spark.streaming.dstream.DStream$$anonfun$8.apply(DStream.scala:760)
        at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:41)
        at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
        at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
        at scala.util.Try$.apply(Try.scala:161)
        at org.apache.spark.streaming.scheduler.Job.run(Job.scala:32)
        at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:155)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
Caused by: java.io.EOFException
        at java.io.DataInputStream.readInt(DataInputStream.java:392)
        at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:811)
        at org.apache.hadoop.ipc.Client$Connection.run(Client.java:749)
[trace] Stack trace suppressed: run last compile:run for the full output.





Thanks,
Saurabh
________________________________
Happiest Minds Disclaimer

This message is for the sole use of the intended recipient(s) and may contain confidential,
proprietary or legally privileged information. Any unauthorized review, use, disclosure or
distribution is prohibited. If you are not the original intended recipient of the message,
please contact the sender by reply email and destroy all copies of the original message.

Happiest Minds Technologies <http://www.happiestminds.com>

________________________________

Mime
View raw message