spark-user mailing list archives

From "Stensrud, Erik" <Erik.Stens...@dnvgl.com>
Subject Re: SparkR 1.4.0: read.df() function fails
Date Wed, 17 Jun 2015 16:12:33 GMT
Thanks to both of you!
You solved the problem.

Thanks
Erik Stensrud

Sent from my iPhone

On 16 Jun 2015, at 20:23, Guru Medasani <gdmeda@gmail.com> wrote:

Hi Esten,

Looks like your sqlContext is connected to a Hadoop/Spark cluster, but the file path you specified is local:

mydf <- read.df(sqlContext, "/home/esten/ami/usaf.json", source="json", header="false")

The error below shows that the input path you specified does not exist on the cluster. Pointing to the correct HDFS path should fix it.
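
For example (a minimal sketch; the HDFS destination directory below is hypothetical), you can either copy the file into HDFS and read it from there, or force the local filesystem with an explicit file:// scheme:

# Option 1: read from HDFS after uploading the file there, e.g. with
# `hadoop fs -put /home/esten/ami/usaf.json /user/esten/` (target dir is hypothetical)
mydf <- read.df(sqlContext, "hdfs://smalldata13.hdp:8020/user/esten/usaf.json", source="json")

# Option 2: keep the file local and say so with a file:// URI; note that on a
# real cluster the file must then be readable at this path on every worker node
mydf <- read.df(sqlContext, "file:///home/esten/ami/usaf.json", source="json")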

Caused by: org.apache.hadoop.mapred.InvalidInputException: Input path does
not exist: hdfs://smalldata13.hdp:8020/home/esten/ami/usaf.json


Guru Medasani
gdmeda@gmail.com



On Jun 16, 2015, at 10:39 AM, Shivaram Venkataraman <shivaram@eecs.berkeley.edu> wrote:

The error you are running into is that the input file does not exist -- you can see it from the following line:
"Input path does not exist: hdfs://smalldata13.hdp:8020/home/esten/ami/usaf.json"

Thanks
Shivaram

On Tue, Jun 16, 2015 at 1:55 AM, esten <erik.stensrud@dnvgl.com> wrote:
Hi,
In the SparkR shell, I invoke:
> mydf <- read.df(sqlContext, "/home/esten/ami/usaf.json", source="json", header="false")
I have tried various file types (csv, txt); all fail.

RESPONSE: "ERROR RBackendHandler: load on 1 failed"
THE WHOLE RESPONSE IS BELOW:
15/06/16 08:09:13 INFO MemoryStore: ensureFreeSpace(177600) called with curMem=0, maxMem=278302556
15/06/16 08:09:13 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 173.4 KB, free 265.2 MB)
15/06/16 08:09:13 INFO MemoryStore: ensureFreeSpace(16545) called with curMem=177600, maxMem=278302556
15/06/16 08:09:13 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 16.2 KB, free 265.2 MB)
15/06/16 08:09:13 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:37142 (size: 16.2 KB, free: 265.4 MB)
15/06/16 08:09:13 INFO SparkContext: Created broadcast 0 from load at NativeMethodAccessorImpl.java:-2
15/06/16 08:09:16 WARN DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
15/06/16 08:09:17 ERROR RBackendHandler: load on 1 failed
java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.api.r.RBackendHandler.handleMethodCall(RBackendHandler.scala:127)
        at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:74)
        at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:36)
        at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
        at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:163)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:130)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
        at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: hdfs://smalldata13.hdp:8020/home/esten/ami/usaf.json
        at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:285)
        at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:228)
        at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:313)
        at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:207)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
        at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
        at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
        at org.apache.spark.rdd.RDD$$anonfun$treeAggregate$1.apply(RDD.scala:1069)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:148)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:109)
        at org.apache.spark.rdd.RDD.withScope(RDD.scala:286)
        at org.apache.spark.rdd.RDD.treeAggregate(RDD.scala:1067)
        at org.apache.spark.sql.json.InferSchema$.apply(InferSchema.scala:58)
        at org.apache.spark.sql.json.JSONRelation$$anonfun$schema$1.apply(JSONRelation.scala:139)
        at org.apache.spark.sql.json.JSONRelation$$anonfun$schema$1.apply(JSONRelation.scala:138)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.sql.json.JSONRelation.schema$lzycompute(JSONRelation.scala:137)
        at org.apache.spark.sql.json.JSONRelation.schema(JSONRelation.scala:137)
        at org.apache.spark.sql.sources.LogicalRelation.<init>(LogicalRelation.scala:30)
        at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:120)
        at org.apache.spark.sql.SQLContext.load(SQLContext.scala:1230)
        ... 25 more
Error: returnStatus == 0 is not TRUE



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/SparkR-1-4-0-read-df-function-fails-tp23333.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.





