spark-user mailing list archives

From Diego Fanesi <>
Subject Workers connected to standalone cluster are continuously crashing
Date Tue, 21 Mar 2017 02:55:10 GMT
Hello everybody,

I configured a simple standalone cluster with a few machines, and I am
trying to submit a very simple job just to test the cluster.

My laptop is the client and one of the workers. My server hosts the
master and the second worker.

If I run the Scala code directly from IntelliJ without using
spark-submit, the job runs correctly, but this is not a supported
procedure and many operations, such as UDFs, do not work properly.

If I use spark-submit, the worker on my laptop runs correctly, but the
second worker on my server continuously exits with the stack trace
attached at the end of this email.
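For reference, the submit command I use is along these lines (the class name, jar path, and master host below are placeholders, not my real values):

```shell
# Minimal submit against the standalone master -- placeholders only
spark-submit \
  --class com.example.TestJob \
  --master spark://my-server:7077 \
  --deploy-mode client \
  target/test-job.jar
```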

I made the following attempts to solve the issue:
- I forced the hostname to resolve to the IP address of the network
interface.
- I replaced OpenJDK with Oracle JDK.
- I set SPARK_LOCAL_IP to the IP address of the network card.
- I changed /etc/nsswitch.conf to resolve hostnames through the internal
DNS server first.
- I created a VM and separated the master from the worker: the master
works, but the VM containing the worker keeps failing. Installing three
workers on three VMs, I get the same issue.
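For completeness, the relevant lines in conf/spark-env.sh on each machine look roughly like this (the addresses are placeholders for my actual network; SPARK_MASTER_HOST is another setting that may help pin the master's bind address):

```shell
# conf/spark-env.sh -- addresses below are placeholders
export SPARK_LOCAL_IP=192.168.1.20     # IP of this machine's network card
export SPARK_MASTER_HOST=192.168.1.10  # IP the standalone master binds to
```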

All systems are Arch Linux, and I can't understand why my laptop behaves
correctly while every other system I install presents the same problem.
Pinging the hostname, I can see that after my configuration it is no
longer resolved as it was before, so this configuration should be correct.
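As a quick check of what address Spark will advertise, I also ran something like the following (this mirrors the hostname lookup Spark falls back to when SPARK_LOCAL_IP is unset; the warning text is mine):

```python
import socket

# Check what the local hostname resolves to. If it resolves to a
# loopback address (127.x.x.x), remote executors cannot connect back
# and fail with "Connection refused".
hostname = socket.gethostname()
addr = socket.gethostbyname(hostname)
print(f"{hostname} -> {addr}")
if addr.startswith("127."):
    print("WARNING: hostname resolves to loopback; "
          "set SPARK_LOCAL_IP or fix /etc/hosts")
```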

I really can't understand what I am doing wrong.

Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
	at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:66)
	at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:188)
	at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:284)
	at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: org.apache.spark.SparkException: Exception thrown in awaitResult
	at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:77)
	at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:75)
	at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
	at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
	at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
	at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
	at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83)
	at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:100)
	at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:202)
	at org.apache.spark.deploy.SparkHadoopUtil$$anon$
	at org.apache.spark.deploy.SparkHadoopUtil$$anon$
	at Method)
	... 4 more
Caused by: Failed to connect to /
	at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:197)
	at org.apache.spark.rpc.netty.Outbox$$anon$
	at org.apache.spark.rpc.netty.Outbox$$anon$
	at java.util.concurrent.ThreadPoolExecutor.runWorker(
	at java.util.concurrent.ThreadPoolExecutor$
Caused by:$AnnotatedConnectException:
Connection refused: /
	at Method)
	at io.netty.util.concurrent.SingleThreadEventExecutor$
	at io.netty.util.concurrent.DefaultThreadFactory$
	... 1 more
