spark-user mailing list archives

From "Sun, Rui" <rui....@intel.com>
Subject RE: SparkR dataFrame read.df fails to read from aws s3
Date Thu, 09 Jul 2015 10:43:04 GMT
Hi, Ben,

For 1), another possible cause is a mismatch between the SparkR R package and the SparkR backend,
although that is unlikely. Are you using a pre-built Spark binary, or did you build it yourself
from the latest master branch code?

From: Sun, Rui [mailto:rui.sun@intel.com]
Sent: Thursday, July 9, 2015 5:51 PM
To: Ben Spark; user
Subject: RE: SparkR dataFrame read.df fails to read from aws s3

Hi, Ben


1)      I guess this may be a JDK version mismatch. Could you check the JDK version?

2)      I believe this is a bug in SparkR. I will file a JIRA issue for it.

From: Ben Spark [mailto:ben_spark_1@yahoo.com.au]
Sent: Thursday, July 9, 2015 12:14 PM
To: user
Subject: SparkR dataFrame read.df fails to read from aws s3

I have Spark 1.4 deployed on AWS EMR, but the SparkR DataFrame read.df method cannot
load data from AWS S3.

1) "read.df" error message

 read.df(sqlContext,"s3://some-bucket/some.json","json")

15/07/09 04:07:01 ERROR r.RBackendHandler: loadDF on org.apache.spark.sql.api.r.SQLUtils failed
java.lang.IllegalArgumentException: invalid method loadDF for object org.apache.spark.sql.api.r.SQLUtils
         at org.apache.spark.api.r.RBackendHandler.handleMethodCall(RBackendHandler.scala:143)
         at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:74)
         at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:36)
         at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)

2) "jsonFile" works, though it produces a warning message

Warning message:
In normalizePath(path) :
  path[1]="s3://rea-consumer-data-dev/cbr/profiler/output/20150618/part-00000": No such file or directory