spark-user mailing list archives

From Sujeet Varakhedi <svarakh...@gopivotal.com>
Subject Re: Spark standalone network configuration problems
Date Fri, 27 Jun 2014 19:22:23 GMT
Looks like your driver is not able to connect to the remote executor on
machine2/130.49.226.148:60949. Can you check whether the master machine can
route to 130.49.226.148?
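One quick way to script that check from the master node (a sketch; the host and port below are the ones from the log in this thread, and the executor port is ephemeral, so substitute whatever your own log reports):

```python
import socket

def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# The failing endpoint from the worker log in this thread; executor ports
# change on every launch, so use the port from your own log:
# tcp_reachable("130.49.226.148", 60949)
```

If this returns False from the master while the executor is up, the problem is routing or a firewall, not Spark itself.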

Sujeet


On Fri, Jun 27, 2014 at 12:04 PM, Shannon Quinn <squinn@gatech.edu> wrote:

> For some reason, commenting out spark.driver.host and spark.driver.port
> fixed something...and broke something else (or at least revealed another
> problem). For reference, the only lines I have in my spark-defaults.conf
> now:
>
> spark.app.name          myProg
> spark.master            spark://192.168.1.101:5060
> spark.executor.memory   8g
> spark.files.overwrite   true
>
> It starts up, but has problems with machine2. For some reason, machine2 is
> having trouble communicating with *itself*. Here are the worker logs of one
> of the failures (there are 10 before it quits):
>
>
> Spark assembly has been built with Hive, including Datanucleus jars on
> classpath
> 14/06/27 14:55:13 INFO ExecutorRunner: Launch command: "java" "-cp" "::/home/spark/spark-1.0.0-bin-hadoop2/conf:/home/spark/spark-1.0.0-bin-hadoop2/lib/spark-assembly-1.0.0-hadoop2.2.0.jar:/home/spark/spark-1.0.0-bin-hadoop2/lib/datanucleus-rdbms-3.2.1.jar:/home/spark/spark-1.0.0-bin-hadoop2/lib/datanucleus-core-3.2.2.jar:/home/spark/spark-1.0.0-bin-hadoop2/lib/datanucleus-api-jdo-3.2.1.jar" "-XX:MaxPermSize=128m" "-Xms8192M" "-Xmx8192M" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "akka.tcp://spark@machine1:46378/user/CoarseGrainedScheduler" "7" "machine2" "8" "akka.tcp://sparkWorker@machine2:48019/user/Worker" "app-20140627144512-0001"
> 14/06/27 14:56:54 INFO Worker: Executor app-20140627144512-0001/7 finished
> with state FAILED message Command exited with code 1 exitStatus 1
> 14/06/27 14:56:54 INFO LocalActorRef: Message [akka.remote.transport.ActorTransportAdapter$DisassociateUnderlying] from Actor[akka://sparkWorker/deadLetters] to Actor[akka://sparkWorker/system/transports/akkaprotocolmanager.tcp0/akkaProtocol-tcp%3A%2F%2FsparkWorker%40130.49.226.148%3A53561-38#-1924573003] was not delivered. [10] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
> 14/06/27 14:56:54 ERROR EndpointWriter: AssociationError
> [akka.tcp://sparkWorker@machine2:48019] -> [akka.tcp://sparkExecutor@machine2:60949]:
> Error [Association failed with [akka.tcp://sparkExecutor@machine2:60949]]
> [
> akka.remote.EndpointAssociationException: Association failed with
> [akka.tcp://sparkExecutor@machine2:60949]
> Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2:
> Connection refused: machine2/130.49.226.148:60949
> ]
> 14/06/27 14:56:54 INFO Worker: Asked to launch executor
> app-20140627144512-0001/8 for Funtown, USA
> 14/06/27 14:56:54 ERROR EndpointWriter: AssociationError
> [akka.tcp://sparkWorker@machine2:48019] -> [akka.tcp://sparkExecutor@machine2:60949]:
> Error [Association failed with [akka.tcp://sparkExecutor@machine2:60949]]
> [
> akka.remote.EndpointAssociationException: Association failed with
> [akka.tcp://sparkExecutor@machine2:60949]
> Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2:
> Connection refused: machine2/130.49.226.148:60949
> ]
> 14/06/27 14:56:54 ERROR EndpointWriter: AssociationError
> [akka.tcp://sparkWorker@machine2:48019] -> [akka.tcp://sparkExecutor@machine2:60949]:
> Error [Association failed with [akka.tcp://sparkExecutor@machine2:60949]]
> [
> akka.remote.EndpointAssociationException: Association failed with
> [akka.tcp://sparkExecutor@machine2:60949]
> Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2:
> Connection refused: machine2/130.49.226.148:60949
> ]
>
> Port 48019 on machine2 is indeed open, connected, and listening. Any ideas?
>
> Thanks!
>
> Shannon
>
> On 6/27/14, 1:54 AM, sujeetv wrote:
>
>> Try explicitly setting the "spark.driver.host" property to the master's
>> IP.
>> Sujeet
>>
>>
>>
>> --
>> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-standalone-network-configuration-problems-tp8304p8396.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>
>
>
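For reference, sujeetv's earlier suggestion applied to the spark-defaults.conf quoted above would look something like this (a sketch only; 192.168.1.101 is the master IP from this thread, so substitute your own master's IP):

```
spark.app.name          myProg
spark.master            spark://192.168.1.101:5060
spark.driver.host       192.168.1.101
spark.executor.memory   8g
spark.files.overwrite   true
```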
