spark-user mailing list archives

From Irina Fedulova <fedul...@gmail.com>
Subject Akka "connection refused" when running standalone Scala app on Spark 0.9.2
Date Fri, 03 Oct 2014 09:32:45 GMT
Hi,

I have set up a Spark 0.9.2 standalone cluster using CDH5 and the 
pre-built Spark distribution archive for Hadoop 2. I did not use the 
spark-ec2 scripts because I am not on the EC2 cloud.

Spark-shell seems to be working properly: I am able to perform simple 
RDD operations, and the SparkPi standalone example also works well 
when run via `run-example`. The web UI shows all workers connected.

However, my standalone Scala application gets "connection refused" 
messages. I think this has something to do with configuration, because 
spark-shell and SparkPi work well. I verified that .setMaster and 
.setSparkHome are properly assigned within the Scala app.
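
For reference, a minimal sketch of the kind of driver configuration 
meant by .setMaster and .setSparkHome above, assuming the Spark 0.9 
SparkConf API. The object name, the Spark home path and the setJars 
call are illustrative assumptions, not copied from the actual app; 
the master URL and jar path are the ones that appear in the sbt output.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical object name, used only for this sketch.
object StandaloneAppSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setMaster("spark://master:7077")  // master URL of the standalone cluster
      .setAppName("MovieLensALS")
      .setSparkHome("/path/to/spark")    // placeholder: Spark install dir on the workers
      // Assumption: the packaged application jar is listed here so that
      // it is shipped to the executors.
      .setJars(Seq("target/scala-2.10/movielens-als_2.10-0.0.jar"))

    val sc = new SparkContext(conf)
    try {
      // Trivial job to check that executors can run tasks for this app.
      println(sc.parallelize(1 to 100).count())
    } finally {
      sc.stop()
    }
  }
}
```

This only illustrates the settings in question; it is not a confirmed 
fix for the errors shown in the logs.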

Is there anything else in the configuration of a standalone Scala app 
on Spark that I am missing?
I would very much appreciate any clues.

Namely, I am trying to run the MovieLensALS.scala example from the 
AMPCamp big data mini course 
(http://ampcamp.berkeley.edu/big-data-mini-course/movie-recommendation-with-mllib.html).

Here is the error I get when I try to run the compiled jar:
---------------
root@master:~/machine-learning/scala# sbt/sbt package "run 
/movielens/medium"
Launching sbt from sbt/sbt-launch-0.12.4.jar
[info] Loading project definition from 
/root/training/machine-learning/scala/project
[info] Set current project to movielens-als (in build 
file:/root/training/machine-learning/scala/)
[info] Compiling 1 Scala source to 
/root/training/machine-learning/scala/target/scala-2.10/classes...
[warn] there were 2 deprecation warning(s); re-run with -deprecation for 
details
[warn] one warning found
[info] Packaging 
/root/training/machine-learning/scala/target/scala-2.10/movielens-als_2.10-0.0.jar 
...
[info] Done packaging.
[success] Total time: 6 s, completed Oct 2, 2014 1:19:00 PM
[info] Running MovieLensALS /movielens/medium
master = spark://master:7077
log4j:WARN No appenders could be found for logger 
(akka.event.slf4j.Slf4jLogger).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for 
more info.
14/10/02 13:19:01 WARN NativeCodeLoader: Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
HERE
THERE
14/10/02 13:19:02 INFO FileInputFormat: Total input paths to process : 1
14/10/02 13:19:03 ERROR TaskSchedulerImpl: Lost executor 0 on host2: 
remote Akka client disassociated
14/10/02 13:19:03 WARN TaskSetManager: Lost TID 1 (task 0.0:1)
14/10/02 13:19:03 WARN TaskSetManager: Lost TID 0 (task 0.0:0)
14/10/02 13:19:03 ERROR TaskSchedulerImpl: Lost executor 4 on host5: 
remote Akka client disassociated
14/10/02 13:19:03 WARN TaskSetManager: Lost TID 3 (task 0.0:1)
14/10/02 13:19:03 ERROR TaskSchedulerImpl: Lost executor 1 on host4: 
remote Akka client disassociated
14/10/02 13:19:03 WARN TaskSetManager: Lost TID 2 (task 0.0:0)
14/10/02 13:19:03 WARN TaskSetManager: Lost TID 4 (task 0.0:1)
14/10/02 13:19:03 ERROR TaskSchedulerImpl: Lost executor 3 on host3: 
remote Akka client disassociated
14/10/02 13:19:03 WARN TaskSetManager: Lost TID 6 (task 0.0:0)
14/10/02 13:19:03 ERROR TaskSchedulerImpl: Lost executor 2 on host1: 
remote Akka client disassociated
14/10/02 13:19:03 WARN TaskSetManager: Lost TID 5 (task 0.0:1)
14/10/02 13:19:03 WARN TaskSetManager: Lost TID 7 (task 0.0:0)
14/10/02 13:19:04 ERROR TaskSchedulerImpl: Lost executor 6 on host4: 
remote Akka client disassociated
14/10/02 13:19:04 WARN TaskSetManager: Lost TID 8 (task 0.0:0)
14/10/02 13:19:04 WARN TaskSetManager: Lost TID 9 (task 0.0:1)
14/10/02 13:19:04 ERROR TaskSchedulerImpl: Lost executor 5 on host2: 
remote Akka client disassociated
14/10/02 13:19:04 WARN TaskSetManager: Lost TID 10 (task 0.0:1)
14/10/02 13:19:04 ERROR TaskSchedulerImpl: Lost executor 7 on host5: 
remote Akka client disassociated
14/10/02 13:19:04 WARN TaskSetManager: Lost TID 11 (task 0.0:0)
14/10/02 13:19:04 WARN TaskSetManager: Lost TID 12 (task 0.0:1)
14/10/02 13:19:04 ERROR TaskSchedulerImpl: Lost executor 8 on host3: 
remote Akka client disassociated
14/10/02 13:19:04 WARN TaskSetManager: Lost TID 13 (task 0.0:1)
14/10/02 13:19:04 ERROR TaskSchedulerImpl: Lost executor 9 on host1: 
remote Akka client disassociated
14/10/02 13:19:04 WARN TaskSetManager: Lost TID 14 (task 0.0:0)
14/10/02 13:19:04 WARN TaskSetManager: Lost TID 15 (task 0.0:1)
14/10/02 13:19:05 ERROR AppClient$ClientActor: Master removed our 
application: FAILED; stopping client
14/10/02 13:19:05 WARN SparkDeploySchedulerBackend: Disconnected from 
Spark cluster! Waiting for reconnection...
14/10/02 13:19:06 ERROR TaskSchedulerImpl: Lost executor 11 on host5: 
remote Akka client disassociated
14/10/02 13:19:06 WARN TaskSetManager: Lost TID 17 (task 0.0:0)
14/10/02 13:19:06 WARN TaskSetManager: Lost TID 16 (task 0.0:1)
---------------

And this is the error log on one of the workers:
---------------
14/10/02 13:19:05 INFO worker.Worker: Executor app-20141002131901-0002/9 
finished with state FAILED message Command exited with code 1 exitStatus 1
14/10/02 13:19:05 INFO actor.LocalActorRef: Message 
[akka.remote.transport.ActorTransportAdapter$DisassociateUnderlying] 
from Actor[akka://sparkWorker/deadLetters] to 
Actor[akka://sparkWorker/system/transports/akkaprotocolmanager.tcp0/akkaProtocol-tcp%3A%2F%2FsparkWorker%40xxx.xx.xx.xx%3A57719-15#1504298502]
was not delivered. [6] dead letters encountered. This logging can be 
turned off or adjusted with configuration settings 
'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
14/10/02 13:19:05 ERROR remote.EndpointWriter: AssociationError 
[akka.tcp://sparkWorker@host1:47421] -> 
[akka.tcp://sparkExecutor@host1:45542]: Error [Association failed with 
[akka.tcp://sparkExecutor@host1:45542]] [
akka.remote.EndpointAssociationException: Association failed with 
[akka.tcp://sparkExecutor@host1:45542]
Caused by: 
akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: 
Connection refused: host1/xxx.xx.xx.xx:45542
]
14/10/02 13:19:05 ERROR remote.EndpointWriter: AssociationError 
[akka.tcp://sparkWorker@host1:47421] -> 
[akka.tcp://sparkExecutor@host1:45542]: Error [Association failed with 
[akka.tcp://sparkExecutor@host1:45542]] [
akka.remote.EndpointAssociationException: Association failed with 
[akka.tcp://sparkExecutor@host1:45542]
Caused by: 
akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: 
Connection refused: host1/xxx.xx.xx.xx:45542
]
14/10/02 13:19:05 ERROR remote.EndpointWriter: AssociationError 
[akka.tcp://sparkWorker@host1:47421] -> 
[akka.tcp://sparkExecutor@host1:45542]: Error [Association failed with 
[akka.tcp://sparkExecutor@host1:45542]] [
akka.remote.EndpointAssociationException: Association failed with 
[akka.tcp://sparkExecutor@host1:45542]
Caused by: 
akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: 
Connection refused: host1/xxx.xx.xx.xx:45542
---------------

Thanks!
Irina
