From Felix Cheung <>
Subject Re: Issue with SparkR setup on RStudio
Date Mon, 02 Jan 2017 18:59:14 GMT
Perhaps it is with


That you have in the sparkConfig parameter.

Unfortunately the exception stack is fairly far away from the actual error, but off the top
of my head, spark.sql.warehouse.dir and HADOOP_HOME are the two pieces that are not
set in the Windows tests.
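
As a rough sketch of how those two settings could be made explicit in an R session (not a confirmed fix): the paths below are copied from the original post, the file:/// form of the warehouse directory is an assumption, and so is the idea that HADOOP_HOME should be the folder that contains bin/winutils.exe rather than bin itself. Hadoop on Windows looks for winutils.exe under %HADOOP_HOME%\bin, which is why the sketch points one level up.

```r
# Sys.setenv() sets a process environment variable; a plain R assignment
# such as SPARK_HOME = "..." only creates an R variable.
# Paths are taken from the original post; adjust them to the local install.
Sys.setenv(SPARK_HOME = "C:/spark-2.1.0-bin-hadoop2.7")

# Assumption: winutils.exe sits in C:/spark-2.1.0-bin-hadoop2.7/bin, so
# HADOOP_HOME is set to the parent of that bin directory.
Sys.setenv(HADOOP_HOME = "C:/spark-2.1.0-bin-hadoop2.7")

library(SparkR, lib.loc = file.path(Sys.getenv("SPARK_HOME"), "R", "lib"))

# Assumption: the warehouse directory is written as a file URI.
sparkR.session(
  appName = "SparkR-DataFrame-example",
  master = "local[*]",
  sparkConfig = list(spark.sql.warehouse.dir = "file:///E:/Exp/")
)
```

If the session starts, createDataFrame() on a small local data.frame is a quick way to check that the session state can be instantiated.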

From: Md. Rezaul Karim <<>>
Sent: Monday, January 2, 2017 7:58 AM
Subject: Re: Issue with SparkR setup on RStudio
To: Felix Cheung <<>>
Cc: spark users <<>>

Hello Cheung,

Happy New Year!

No, I did not configure Hive on my machine. I have even tried not setting HADOOP_HOME,
but I get the same error.

Md. Rezaul Karim BSc, MSc
PhD Researcher, INSIGHT Centre for Data Analytics
National University of Ireland, Galway
IDA Business Park, Dangan, Galway, Ireland

On 29 December 2016 at 19:16, Felix Cheung <<>> wrote:
Any reason you are setting HADOOP_HOME?

From the error it seems you are running into an issue with the Hive config, likely when it
tries to load hive-site.xml. Could you try not setting HADOOP_HOME?
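
One way to try this from within R, as a minimal sketch: it assumes SPARK_HOME is already set in the environment and that any earlier SparkR session should be stopped first, so that the next sparkR.session() call launches a fresh JVM with the changed environment.

```r
# Show what the current R process sees; "" means the variable is not set.
Sys.getenv("HADOOP_HOME")

# Remove the variable for this R session only; system-wide settings are untouched.
Sys.unsetenv("HADOOP_HOME")

# Assumption: SPARK_HOME is set, so the SparkR package can be located.
library(SparkR, lib.loc = file.path(Sys.getenv("SPARK_HOME"), "R", "lib"))

# Stop any running session before calling sparkR.session() again.
sparkR.session.stop()
```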

From: Md. Rezaul Karim <<>>
Sent: Thursday, December 29, 2016 10:24:57 AM
To: spark users
Subject: Issue with SparkR setup on RStudio

Dear Spark users,
I am trying to set up SparkR on RStudio to perform some basic data manipulation and ML modeling.
However, I am getting a strange error while creating a SparkR session or DataFrame that says:
java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveSessionState'.
According to Spark documentation at,
I don’t need to configure Hive path or related variables.
I have the following source code:

SPARK_HOME = "C:/spark-2.1.0-bin-hadoop2.7"
HADOOP_HOME= "C:/spark-2.1.0-bin-hadoop2.7/bin/"

library(SparkR, lib.loc = c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib")))
sparkR.session(appName = "SparkR-DataFrame-example",
               master = "local[*]",
               sparkConfig = list(spark.sql.warehouse.dir = "E:/Exp/",
                                  spark.driver.memory = "8g"),
               enableHiveSupport = TRUE)

# Create a simple local data.frame
localDF <- data.frame(name=c("John", "Smith", "Sarah"), age=c(19, 23, 18))
# Convert local data frame to a SparkDataFrame
df <- createDataFrame(localDF)

Please note that the HADOOP_HOME directory contains the ‘winutils.exe’ file. The details of the
error are as follows:

Error in handleErrors(returnStatus, conn) :  java.lang.IllegalArgumentException: Error while
instantiating 'org.apache.spark.sql.hive.HiveSessionState':
        at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:981)
        at org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:110)
        at org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:109)
        at org.apache.spark.sql.api.r.SQLUtils$$anonfun$setSparkContextSessionConf$2.apply(SQLUtils.scala:67)
        at org.apache.spark.sql.api.r.SQLUtils$$anonfun$setSparkContextSessionConf$2.apply(SQLUtils.scala:66)
        at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
        at scala.collection.Iterator$class.foreach(Iterator.scala:893)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
        at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
        at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
        at scala.collection.Traversabl

 Any kind of help would be appreciated.

