spark-user mailing list archives

From Shivaram Venkataraman <shiva...@eecs.berkeley.edu>
Subject Re: SparkR 1.4.0: read.df() function fails
Date Tue, 16 Jun 2015 17:39:17 GMT
The error you are running into is that the input file does not exist. You
can see it in the following line of the stack trace:
"Input path does not exist: hdfs://smalldata13.hdp:8020/home/esten/ami/usaf.json"
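
The path in the error carries an hdfs:// prefix even though the call passed a bare /home/... path, which suggests the bare path was resolved against the cluster's default filesystem. Below is a rough Python sketch of that resolution, assuming the default filesystem is hdfs://smalldata13.hdp:8020 as shown in the error. The helper `qualify` is hypothetical and only illustrates the idea for absolute paths; it is not Hadoop's actual implementation.

```python
from urllib.parse import urlparse


def qualify(path: str, default_fs: str) -> str:
    """Prefix an absolute path with the default filesystem when it has no scheme.

    Hypothetical helper for illustration only; real Hadoop path
    qualification also handles relative paths and working directories.
    """
    if urlparse(path).scheme:
        # Already fully qualified, e.g. file:///... or hdfs://...
        return path
    return default_fs.rstrip("/") + "/" + path.lstrip("/")


# Assumed default filesystem, taken from the error message in this thread.
default_fs = "hdfs://smalldata13.hdp:8020"

print(qualify("/home/esten/ami/usaf.json", default_fs))
print(qualify("file:///home/esten/ami/usaf.json", default_fs))
```

The first call prints the same hdfs:// URI that appears in the error, while the second is returned unchanged. So if the file actually lives on the local disk, passing a path with an explicit file:// scheme may be enough; otherwise the file would need to be copied into HDFS first.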

Thanks
Shivaram

On Tue, Jun 16, 2015 at 1:55 AM, esten <erik.stensrud@dnvgl.com> wrote:

> Hi,
> In SparkR shell, I invoke:
> > mydf <- read.df(sqlContext, "/home/esten/ami/usaf.json", source="json", header="false")
> I have tried various filetypes (csv, txt), all fail.
>
> RESPONSE: "ERROR RBackendHandler: load on 1 failed"
> BELOW THE WHOLE RESPONSE:
> 15/06/16 08:09:13 INFO MemoryStore: ensureFreeSpace(177600) called with curMem=0, maxMem=278302556
> 15/06/16 08:09:13 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 173.4 KB, free 265.2 MB)
> 15/06/16 08:09:13 INFO MemoryStore: ensureFreeSpace(16545) called with curMem=177600, maxMem=278302556
> 15/06/16 08:09:13 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 16.2 KB, free 265.2 MB)
> 15/06/16 08:09:13 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:37142 (size: 16.2 KB, free: 265.4 MB)
> 15/06/16 08:09:13 INFO SparkContext: Created broadcast 0 from load at NativeMethodAccessorImpl.java:-2
> 15/06/16 08:09:16 WARN DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
> 15/06/16 08:09:17 ERROR RBackendHandler: load on 1 failed
> java.lang.reflect.InvocationTargetException
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.spark.api.r.RBackendHandler.handleMethodCall(RBackendHandler.scala:127)
>         at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:74)
>         at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:36)
>         at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
>         at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
>         at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:163)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
>         at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
>         at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:130)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>         at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
>         at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: hdfs://smalldata13.hdp:8020/home/esten/ami/usaf.json
>         at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:285)
>         at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:228)
>         at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:313)
>         at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:207)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
>         at scala.Option.getOrElse(Option.scala:120)
>         at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
>         at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
>         at scala.Option.getOrElse(Option.scala:120)
>         at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
>         at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
>         at scala.Option.getOrElse(Option.scala:120)
>         at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
>         at org.apache.spark.rdd.RDD$$anonfun$treeAggregate$1.apply(RDD.scala:1069)
>         at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:148)
>         at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:109)
>         at org.apache.spark.rdd.RDD.withScope(RDD.scala:286)
>         at org.apache.spark.rdd.RDD.treeAggregate(RDD.scala:1067)
>         at org.apache.spark.sql.json.InferSchema$.apply(InferSchema.scala:58)
>         at org.apache.spark.sql.json.JSONRelation$$anonfun$schema$1.apply(JSONRelation.scala:139)
>         at org.apache.spark.sql.json.JSONRelation$$anonfun$schema$1.apply(JSONRelation.scala:138)
>         at scala.Option.getOrElse(Option.scala:120)
>         at org.apache.spark.sql.json.JSONRelation.schema$lzycompute(JSONRelation.scala:137)
>         at org.apache.spark.sql.json.JSONRelation.schema(JSONRelation.scala:137)
>         at org.apache.spark.sql.sources.LogicalRelation.<init>(LogicalRelation.scala:30)
>         at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:120)
>         at org.apache.spark.sql.SQLContext.load(SQLContext.scala:1230)
>         ... 25 more
> Error: returnStatus == 0 is not TRUE
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/SparkR-1-4-0-read-df-function-fails-tp23333.html
