spark-user mailing list archives

From Bart Vercammen <bart.vercam...@portico.io>
Subject Spark 0.8.0 on Mesos 0.13.0 (clustered) : NoClassDefFoundError
Date Sat, 12 Oct 2013 09:26:46 GMT
Hi,

I have an issue getting Spark jobs to run on a Mesos cluster.
(Most probably it's a config issue - I hope - but let me explain what I did):

- installed Mesos on a cluster (1 master and 3 workers) with ZooKeeper
support.
- Mesos is running fine:
   curl 'http://mesos-master:5050/master/state.json' | python -mjson.tool
  shows me the master and the slaves
- then, as mentioned in the Spark README, I created a Spark distribution
with 'make-distribution.sh' and uploaded it to HDFS (=>
spark/spark-0.8.0-incubating.tar.gz)
- I configured the environment variables on all the instances:
    * SPARK_EXECUTOR_URI="hdfs://hdfs-namenode:8020/spark/spark-0.8.0-incubating.tar.gz"
    * MESOS_NATIVE_LIBRARY="/usr/local/lib/libmesos.so"
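
For reference, the build, upload, and environment setup above amount to roughly this (a sketch only; paths and hostnames are the ones from my setup, and the exact make-distribution.sh flags may differ per version):

```shell
# Build a distributable Spark tarball from the 0.8.0 source tree
# (make-distribution.sh packages Spark into dist/; flags may vary by version)
./make-distribution.sh --tgz

# Upload the resulting tarball to HDFS so the Mesos slaves can fetch it
hadoop fs -mkdir /spark
hadoop fs -put spark-0.8.0-incubating.tar.gz /spark/

# On every node that submits jobs, e.g. in conf/spark-env.sh:
export SPARK_EXECUTOR_URI="hdfs://hdfs-namenode:8020/spark/spark-0.8.0-incubating.tar.gz"
export MESOS_NATIVE_LIBRARY="/usr/local/lib/libmesos.so"
```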

When I start spark-shell, it starts up fine:
MASTER=mesos://mesos-master:5050 ./spark-shell
Welcome to
       ____              __
      / __/__  ___ _____/ /__
     _\ \/ _ \/ _ `/ __/  '_/
    /___/ .__/\_,_/_/ /_/\_\   version 0.8.0
       /_/

Using Scala version 2.9.3 (Java HotSpot(TM) 64-Bit Server VM, Java 1.6.0_45)
Initializing interpreter...
13/10/11 15:04:34 INFO server.Server: jetty-7.x.y-SNAPSHOT
13/10/11 15:04:34 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:38491
Creating SparkContext...
13/10/11 15:04:50 INFO slf4j.Slf4jEventHandler: Slf4jEventHandler started
13/10/11 15:04:50 INFO spark.SparkEnv: Registering BlockManagerMaster
13/10/11 15:04:50 INFO storage.MemoryStore: MemoryStore started with 
capacity 326.7 MB.
13/10/11 15:04:50 INFO storage.DiskStore: Created local directory at 
/tmp/spark-local-20131011150450-3616
13/10/11 15:04:50 INFO network.ConnectionManager: Bound socket to port 
36220 with id = ConnectionManagerId(ip-*******,36220)
13/10/11 15:04:50 INFO storage.BlockManagerMaster: Trying to register 
BlockManager
13/10/11 15:04:50 INFO storage.BlockManagerMaster: Registered BlockManager
13/10/11 15:04:50 INFO server.Server: jetty-7.x.y-SNAPSHOT
13/10/11 15:04:50 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:60233
13/10/11 15:04:50 INFO broadcast.HttpBroadcast: Broadcast server started 
at http://*******:60233
13/10/11 15:04:50 INFO spark.SparkEnv: Registering MapOutputTracker
13/10/11 15:04:51 INFO spark.HttpFileServer: HTTP File server directory 
is /tmp/spark-ca7119f6-4190-4d93-83ed-0fda261f3071
13/10/11 15:04:51 INFO server.Server: jetty-7.x.y-SNAPSHOT
13/10/11 15:04:51 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:44652
13/10/11 15:04:51 INFO server.Server: jetty-7.x.y-SNAPSHOT
13/10/11 15:04:51 INFO handler.ContextHandler: started 
o.e.j.s.h.ContextHandler{/storage/rdd,null}
13/10/11 15:04:51 INFO handler.ContextHandler: started 
o.e.j.s.h.ContextHandler{/storage,null}
13/10/11 15:04:51 INFO handler.ContextHandler: started 
o.e.j.s.h.ContextHandler{/stages/stage,null}
13/10/11 15:04:51 INFO handler.ContextHandler: started 
o.e.j.s.h.ContextHandler{/stages/pool,null}
13/10/11 15:04:51 INFO handler.ContextHandler: started 
o.e.j.s.h.ContextHandler{/stages,null}
13/10/11 15:04:51 INFO handler.ContextHandler: started 
o.e.j.s.h.ContextHandler{/environment,null}
13/10/11 15:04:51 INFO handler.ContextHandler: started 
o.e.j.s.h.ContextHandler{/executors,null}
13/10/11 15:04:51 INFO handler.ContextHandler: started 
o.e.j.s.h.ContextHandler{/metrics/json,null}
13/10/11 15:04:51 INFO handler.ContextHandler: started 
o.e.j.s.h.ContextHandler{/static,null}
13/10/11 15:04:51 INFO handler.ContextHandler: started 
o.e.j.s.h.ContextHandler{/,null}
13/10/11 15:04:51 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040
13/10/11 15:04:51 INFO ui.SparkUI: Started Spark Web UI at http://ip-*******:4040
13/10/11 15:04:51 INFO mesos.MesosSchedulerBackend: Registered as 
framework ID 201310110648-2269968906-5050-12731-0007
Spark context available as sc.
Type in expressions to have them evaluated.
Type :help for more information.
(The '*******' are of course the masked-out IPs of the instances ;-)

But when I launch a job (e.g. a word count on something in HDFS), I can
see the mesos-slaves shooting into action, and I also see the tasks
popping up in the Mesos UI, but the tasks fail with the following error:
13/10/11 15:06:59 INFO cluster.ClusterTaskSetManager: Serialized task 
0.0:5 as 1581 bytes in 0 ms
13/10/11 15:07:05 INFO cluster.ClusterTaskSetManager: Re-queueing tasks 
for 201310110648-2269968906-5050-12731-34 from TaskSet 0.0
13/10/11 15:07:05 INFO cluster.ClusterTaskSetManager: Lost TID 23 (task 
0.0:5)
13/10/11 15:07:05 INFO cluster.ClusterTaskSetManager: Lost TID 21 (task 
0.0:4)
13/10/11 15:07:05 INFO cluster.ClusterTaskSetManager: Lost TID 22 (task 
0.0:1)
13/10/11 15:07:05 INFO scheduler.DAGScheduler: Executor lost: 
201310110648-2269968906-5050-12731-34 (epoch 6)
13/10/11 15:07:05 INFO storage.BlockManagerMasterActor: Trying to remove 
executor 201310110648-2269968906-5050-12731-34 from BlockManagerMaster.
13/10/11 15:07:05 INFO storage.BlockManagerMaster: Removed 
201310110648-2269968906-5050-12731-34 successfully in removeExecutor

In the mesos logs (in stderr on the mesos-slaves):
Exception in thread "main" java.lang.NoClassDefFoundError: 
org/apache/spark/executor/MesosExecutorBackend
Caused by: java.lang.ClassNotFoundException: 
org.apache.spark.executor.MesosExecutorBackend
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
Could not find the main class: 
org.apache.spark.executor.MesosExecutorBackend.  Program will exit.
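
One check I could try (hypothetical paths; the jar name depends on what make-distribution.sh actually produced) is to verify that the uploaded archive really contains the executor class:

```shell
# Pull the archive back from HDFS and list its contents
hadoop fs -get /spark/spark-0.8.0-incubating.tar.gz /tmp/
tar -tzf /tmp/spark-0.8.0-incubating.tar.gz | grep -i assembly

# Then look for the missing class inside the assembly jar
# (replace the jar path with whatever the listing above shows)
unzip -l /tmp/spark-assembly.jar | grep MesosExecutorBackend
```

If the class is missing, the tarball was presumably built without the assembly jar; if it is present, the slaves may be fetching or unpacking a different archive than expected.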

Can someone explain what might be going on?

Also: I did not install Spark on any of the mesos-slaves, as I am under
the assumption that a Spark installation on the worker nodes is no
longer needed when running on top of Mesos. Is this a correct
assumption? If not, how should I configure this?

Thanks in advance.
Greets,
Bart
