spark-user mailing list archives

From ZhuGe <t...@outlook.com>
Subject workers no route to host
Date Tue, 31 Mar 2015 07:12:30 GMT
Hi, I set up a standalone cluster of 5 machines (tmaster, tslave1, tslave2, tslave3, tslave4) with
spark-1.3.0-cdh5.4.0-SNAPSHOT. When I execute sbin/start-all.sh, the master is OK, but I can't see
the web UI. Moreover, the worker log looks like this:
Spark assembly has been built with Hive, including Datanucleus jars on classpath
/data/PlatformDep/cdh5/dist/bin/compute-classpath.sh: line 164: hadoop: command not found
Spark Command: java -cp :/data/PlatformDep/cdh5/dist/sbin/../conf:/data/PlatformDep/cdh5/dist/lib/spark-assembly-1.3.0-cdh5.4.0-SNAPSHOT-hadoop2.6.0-cdh5.4.0-SNAPSHOT.jar:/data/PlatformDep/cdh5/dist/lib/datanucleus-rdbms-3.2.1.jar:/data/PlatformDep/cdh5/dist/lib/datanucleus-api-jdo-3.2.1.jar:/data/PlatformDep/cdh5/dist/lib/datanucleus-core-3.2.2.jar: -XX:MaxPermSize=128m -Dspark.akka.logLifecycleEvents=true -Xms512m -Xmx512m org.apache.spark.deploy.worker.Worker spark://192.168.128.16:7071 --webui-port 8081
========================================
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/03/31 06:47:22 INFO Worker: Registered signal handlers for [TERM, HUP, INT]
15/03/31 06:47:23 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/03/31 06:47:23 INFO SecurityManager: Changing view acls to: dcadmin
15/03/31 06:47:23 INFO SecurityManager: Changing modify acls to: dcadmin
15/03/31 06:47:23 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(dcadmin); users with modify permissions: Set(dcadmin)
15/03/31 06:47:23 INFO Slf4jLogger: Slf4jLogger started
15/03/31 06:47:23 INFO Remoting: Starting remoting
15/03/31 06:47:23 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkWorker@tslave2:60815]
15/03/31 06:47:24 INFO Utils: Successfully started service 'sparkWorker' on port 60815.
15/03/31 06:47:24 INFO Worker: Starting Spark worker tslave2:60815 with 2 cores, 3.0 GB RAM
15/03/31 06:47:24 INFO Worker: Running Spark version 1.3.0
15/03/31 06:47:24 INFO Worker: Spark home: /data/PlatformDep/cdh5/dist
15/03/31 06:47:24 INFO Server: jetty-8.y.z-SNAPSHOT
15/03/31 06:47:24 INFO AbstractConnector: Started SelectChannelConnector@0.0.0.0:8081
15/03/31 06:47:24 INFO Utils: Successfully started service 'WorkerUI' on port 8081.
15/03/31 06:47:24 INFO WorkerWebUI: Started WorkerWebUI at http://tslave2:8081
15/03/31 06:47:24 INFO Worker: Connecting to master akka.tcp://sparkMaster@192.168.128.16:7071/user/Master...
15/03/31 06:47:24 ERROR EndpointWriter: AssociationError [akka.tcp://sparkWorker@tslave2:60815] -> [akka.tcp://sparkMaster@192.168.128.16:7071]: Error [Association failed with [akka.tcp://sparkMaster@192.168.128.16:7071]]
[akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@192.168.128.16:7071]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: No route to host]
15/03/31 06:47:24 ERROR EndpointWriter: AssociationError [akka.tcp://sparkWorker@tslave2:60815] -> [akka.tcp://sparkMaster@192.168.128.16:7071]: Error [Association failed with [akka.tcp://sparkMaster@192.168.128.16:7071]]
[akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@192.168.128.16:7071]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: No route to host]
15/03/31 06:47:24 ERROR EndpointWriter: AssociationError [akka.tcp://sparkWorker@tslave2:60815] -> [akka.tcp://sparkMaster@192.168.128.16:7071]: Error [Association failed with [akka.tcp://sparkMaster@192.168.128.16:7071]]
[akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@192.168.128.16:7071]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: No route to host]
15/03/31 06:47:24 ERROR EndpointWriter: AssociationError [akka.tcp://sparkWorker@tslave2:60815] -> [akka.tcp://sparkMaster@192.168.128.16:7071]: Error [Association failed with [akka.tcp://sparkMaster@192.168.128.16:7071]]
[akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@192.168.128.16:7071]


The worker machines can ping the master machine successfully. The hosts file looks like this:
192.168.128.16 tmaster tmaster
192.168.128.17 tslave1 tslave1
192.168.128.18 tslave2 tslave2
192.168.128.19 tslave3 tslave3
192.168.128.20 tslave4 tslave4
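
A successful ping only shows that ICMP reaches the master; it does not show that TCP port 7071 (the master port in the log above) accepts connections from a worker. A minimal sketch of such a TCP check follows, assuming Python 3 is available on the worker machine. The function name `can_connect` is hypothetical and not part of Spark, and this is only one possible way to narrow the problem down, not a confirmed diagnosis.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Try to open a TCP connection to host:port.

    Returns (True, None) on success, or (False, error_text) if the
    connection attempt fails (refused, timed out, no route to host, ...).
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True, None
    except OSError as exc:
        return False, str(exc)


# Example call from a worker such as tslave2 (address and port from the log):
#   ok, err = can_connect("192.168.128.16", 7071)
#   print("reachable" if ok else "unreachable: " + err)
```

If the call fails with a "No route to host" error while ping still works, a packet filter on the master rejecting that port would be one possible explanation worth checking.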
Hope someone can help. Thanks.