spark-user mailing list archives

From Matei Zaharia <matei.zaha...@gmail.com>
Subject Re: What I am missing from configuration?
Date Mon, 27 Jan 2014 18:13:39 GMT
Hi Dana,

I think the problem is that your simple.sbt does not add a dependency on hadoop-client for
CDH4, so your driver application gets a different version of the Hadoop library than the
cluster. Try adding a dependency on hadoop-client version 2.0.0-mr1-cdh4.X.X for your
version of CDH4, as well as the following line to add the resolver:

resolvers += "Cloudera Repository" at "https://repository.cloudera.com/artifactory/cloudera-repos/"
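
For reference, a minimal simple.sbt along those lines might look like the sketch below. This is an assumption-laden example, not the poster's actual file: the artifact coordinates match the Spark 0.8.1 quick-start era, and "2.0.0-mr1-cdh4.X.X" is deliberately left as a placeholder, since the exact X.X depends on the CDH4 release installed on the cluster.

```scala
// simple.sbt — sketch of a build file for Spark 0.8.1-incubating on CDH4.
// NOTE: "2.0.0-mr1-cdh4.X.X" is a placeholder; substitute the CDH4 version
// actually deployed on your cluster so driver and executors agree.
name := "Simple Project"

version := "1.0"

scalaVersion := "2.9.3"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "0.8.1-incubating",
  // Must match the Hadoop version on the cluster, otherwise the driver
  // serializes against one Hadoop client and the executors load another.
  "org.apache.hadoop" % "hadoop-client" % "2.0.0-mr1-cdh4.X.X"
)

resolvers += "Cloudera Repository" at "https://repository.cloudera.com/artifactory/cloudera-repos/"
```

After editing the file, re-run `sbt package` so the jar you pass to SparkContext is rebuilt against the matching hadoop-client.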

Matei

On Jan 24, 2014, at 2:53 AM, Dana Tontea <dt@cylex.ro> wrote:

> I am completely new to Spark.
> I want to run the examples from here:
> https://spark.incubator.apache.org/docs/0.8.1/quick-start.html
> from the section "A Standalone App in Scala".
> When I run locally with the local scheduler
>     val sc = new SparkContext("local[2]", "Simple App",
>       "/home/spark-0.8.1-incubating-bin-cdh4",
>       List("target/scala-2.9.3/simple-project_2.9.3-1.0.jar"))
> it works fine. But when I replace it with the master URL from the web UI
> (spark://192.168.6.66:7077)
>     val sc = new SparkContext("spark://192.168.6.66:7077", "Simple App",
>       "/home/spark-0.8.1-incubating-bin-cdh4",
>       List("target/scala-2.9.3/simple-project_2.9.3-1.0.jar"))
> I get a long error:
> Starting task 0.0:1 as TID 6 on executor 0: ro-mysql5.cylex.local
> (PROCESS_LOCAL)
> 14/01/23 17:02:48 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:1
> as 1801 bytes in 1 ms
> 14/01/23 17:02:48 WARN cluster.ClusterTaskSetManager: Lost TID 5 (task
> 0.0:0)
> 14/01/23 17:02:48 INFO cluster.ClusterTaskSetManager: Loss was due to
> java.lang.OutOfMemoryError: Java heap space [duplicate 5]
> The entire error log can be found in the attached file: Error
> <http://apache-spark-user-list.1001560.n3.nabble.com/file/n878/Error>
> 
> Can somebody explain what I am missing, and what the differences are between these
> two schedulers: local[2] and spark://192.168.6.66:7077? Why can I not see the job in
> the web UI (http://localhost:8080/) when running with local[2]?
> Here are the Scala code, SimpleJob.scala
> <http://apache-spark-user-list.1001560.n3.nabble.com/file/n878/SimpleJob.scala>,
> and the sbt file, simple.sbt
> <http://apache-spark-user-list.1001560.n3.nabble.com/file/n878/simple.sbt>.
> 
> And could somebody please point me to a step-by-step tutorial or course on how
> to set up a cluster correctly and how to access it from an IDE (IntelliJ IDEA)?
> Thanks in advance!
> 
> 
> 
> --
> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/What-I-am-missing-from-configuration-tp878.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.

