spark-issues mailing list archives

From "chintan (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (SPARK-12551) Not able to load CSV package
Date Tue, 29 Dec 2015 16:02:49 GMT

     [ https://issues.apache.org/jira/browse/SPARK-12551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chintan resolved SPARK-12551.
-----------------------------
          Resolution: Fixed
       Fix Version/s: 1.5.2
    Target Version/s: 1.5.2

I finally found a solution to this issue. Make sure of the following:

The Java Development Kit is installed; you can download it from the website. Hadoop is
downloaded and saved to C:/hadoop, so that its bin folder is C:/hadoop/bin.

Set JAVA_HOME as an environment variable (do not include the bin folder in the path). Set
HADOOP_HOME as an environment variable (again, without the bin folder).
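
The two environment variable requirements above can be checked from R before starting Spark. This is only a minimal sketch using base R; `check_home` is a hypothetical helper name, not part of SparkR, and the check assumes each home directory should contain a bin subfolder, as described above.

```r
# Hypothetical helper: checks that an environment variable is set, does not
# itself point at a "bin" folder, and names a directory containing "bin".
check_home <- function(var) {
  value <- Sys.getenv(var, unset = NA)
  if (is.na(value) || !nzchar(value)) {
    return(paste(var, "is not set"))
  }
  if (tolower(basename(value)) == "bin") {
    return(paste(var, "should point to the parent of bin, not bin itself"))
  }
  if (!dir.exists(file.path(value, "bin"))) {
    return(paste(var, "has no bin subfolder:", value))
  }
  "ok"
}

# Report the status of both variables mentioned above
for (var in c("JAVA_HOME", "HADOOP_HOME")) {
  cat(var, ":", check_home(var), "\n")
}
```

If either line reports something other than "ok", fix the variable and restart RStudio so that the new value is picked up.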

Now run the following:

# Clear the workspace
rm(list = ls())

# Set the system environment variables
Sys.setenv(SPARK_HOME = "C:/spark")
Sys.setenv(HADOOP_HOME = "C:/Hadoop")

# Put the SparkR library bundled with Spark on the library search path
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))

# Load the SparkR library
library(rJava)
library(SparkR)

# Ask spark-submit to fetch the spark-csv package, and set the memory to 1g
Sys.setenv('SPARKR_SUBMIT_ARGS'='"--packages" "com.databricks:spark-csv_2.11:1.2.0" "sparkr-shell"')
Sys.setenv(SPARK_MEM="1g")

# Create a Spark context and a SQL context
sc <- sparkR.init(master = "local")
sqlContext <- sparkRSQL.init(sc)
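
With the contexts created, a CSV file could then be loaded through the spark-csv data source. The sketch below assumes the SparkR 1.5 `read.df` function and the `com.databricks.spark.csv` source name of the package requested above; `load_csv` and the file path are hypothetical names used only for illustration and do not come from the original report.

```r
# Hypothetical wrapper around SparkR's read.df (SparkR 1.5.x API).
# Assumes the spark-csv package was requested through SPARKR_SUBMIT_ARGS.
load_csv <- function(sqlContext, path) {
  SparkR::read.df(
    sqlContext,
    path,
    source = "com.databricks.spark.csv",
    header = "true",       # treat the first line as column names
    inferSchema = "true"   # let spark-csv guess the column types
  )
}

# Example call (the path is a placeholder):
# df <- load_csv(sqlContext, "C:/data/example.csv")
# head(df)
```

Defining the wrapper does not touch Spark; only the call itself needs the running context from the steps above.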

> Not able to load CSV package
> ----------------------------
>
>                 Key: SPARK-12551
>                 URL: https://issues.apache.org/jira/browse/SPARK-12551
>             Project: Spark
>          Issue Type: Bug
>          Components: SparkR
>    Affects Versions: 1.5.2
>         Environment: Rstudio under Windows
>            Reporter: chintan
>             Fix For: 1.5.2
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> Whenever I try to load the CSV package, Spark does not work; it gives an invokeJava error.
> Sys.setenv('SPARKR_SUBMIT_ARGS'='"--packages" "com.databricks:spark-csv_2.10:1.2.0" "sparkr-shell"')
> > Sys.setenv(SPARK_MEM="1g")
> > sc <- sparkR.init(master = "local")
> Launching java with spark-submit command C:/spark/bin/spark-submit.cmd   "--packages" "com.databricks:spark-csv_2.10:1.2.0" "sparkr-shell" C:\Users\shahch07\AppData\Local\Temp\RtmpigvXMn\backend_port98840b15c5a

> > sqlContext <- sparkRSQL.init(sc)
> > DF <- createDataFrame(sqlContext, faithful)
> Error in invokeJava(isStatic = FALSE, objId$id, methodName, ...) : 
>   org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.NullPointerException
> 	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1012)
> 	at org.apache.hadoop.util.Shell.runCommand(Shell.java:482)
> 	at org.apache.hadoop.util.Shell.run(Shell.java:455)
> 	at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
> 	at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:873)
> 	at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:853)
> 	at org.apache.spark.util.Utils$.fetchFile(Utils.scala:381)
> 	at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$5.apply(Executor.scala:405)
> 	at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$5.apply(Executor.scala:397)
> 	at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLi




