spark-user mailing list archives

From Gourav Sengupta <gourav.sengu...@gmail.com>
Subject Starting SPARK application in cluster mode from an IDE
Date Sat, 27 Feb 2016 00:39:51 GMT
Hi,

The problem is described below: why can I create a SPARK application using Python but not SCALA (from an IDE like IntelliJ or Eclipse)?

SPARK Environment:
-----------------------------
SPARK Version: 1.6.0
OS: MAC OS X 10.11.3
IDE:  IntelliJ


Created a SBT project in IntelliJ using the details in this page:
---------------------------------------------------------------------------------
http://spark.apache.org/docs/latest/quick-start.html


The following SCALA code fails to create an application in a locally
running SPARK cluster (started by running ./sbin/start-master.sh and
./sbin/start-slaves.sh):

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf


object test {
  def main(args: Array[String]) {
    //the below line returns nothing
    println(SparkContext.jarOfClass(this.getClass).toString())
    val logFile = "/tmp/README.md" // Should be some file on your system
    val conf = new SparkConf().setAppName("IdeaProjects").setMaster("spark://systemhostname:7077")
    //val conf = new SparkConf().setAppName("IdeaProjects").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val logData = sc.textFile(logFile, 2).cache()
    val numAs = logData.filter(line => line.contains("a")).count()
    val numBs = logData.filter(line => line.contains("b")).count()
    println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
  }
}
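
From the Stack Overflow thread linked at the end of this message, the failure against a standalone master is typically a ClassNotFoundException for the anonymous function classes: the IDE runs the driver from compiled .class files, so the application jar is never shipped to the executors. One workaround discussed there is to build the jar first (e.g. with `sbt package`) and pass it explicitly via SparkConf.setJars. This is only a sketch under that assumption, and the jar path below is hypothetical (adjust it to your project's actual sbt output):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object testWithJars {
  def main(args: Array[String]) {
    val conf = new SparkConf()
      .setAppName("IdeaProjects")
      .setMaster("spark://systemhostname:7077")
      // Hypothetical path: run `sbt package` first and point this at the
      // resulting artifact so the executors can load the anonfun classes.
      .setJars(Seq("target/scala-2.10/ideaprojects_2.10-1.0.jar"))
    val sc = new SparkContext(conf)
    val logData = sc.textFile("/tmp/README.md", 2).cache()
    println("Lines with a: " + logData.filter(_.contains("a")).count())
    sc.stop()
  }
}
```

(I have not verified this on 1.6.0; it requires the cluster at spark://systemhostname:7077 to be running.)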


The following code runs fine
--------------------------------------

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf


object test {
  def main(args: Array[String]) {
    //the below line returns nothing
    println(SparkContext.jarOfClass(this.getClass).toString())
    val logFile = "/tmp/README.md" // Should be some file on your system
    //val conf = new SparkConf().setAppName("IdeaProjects").setMaster("spark://systemhostname:7077")
    val conf = new SparkConf().setAppName("IdeaProjects").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val logData = sc.textFile(logFile, 2).cache()
    val numAs = logData.filter(line => line.contains("a")).count()
    val numBs = logData.filter(line => line.contains("b")).count()
    println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
  }
}


Creating an application using Python, however, works fine, as the following
code runs without issue:
-----------------------------------------------------------------------------------------------------------------------
from pyspark import SparkConf, SparkContext
conf = (SparkConf()
        .setMaster("spark://systemhostname:7077")
        .setAppName("test")
        .set("spark.executor.memory", "1g")
        .set("spark.executor.cores", "2"))
conf.getAll()
sc = SparkContext(conf = conf)


Further description of, and links about, this issue are here:
http://stackoverflow.com/questions/33222045/classnotfoundexception-anonfun-when-deploy-scala-code-to-spark


Thanks and Regards,
Gourav Sengupta
