spark-user mailing list archives

From Ravi Hemnani
Subject Spark workers keep getting disconnected(Keep dying) from the cluster.
Date Thu, 15 May 2014 08:04:03 GMT

I am facing a weird issue.

My Spark workers keep dying every now and then, and in the master logs I keep
seeing the following messages:

14/05/14 10:09:24 WARN Master: Removing worker-20140514080546-x.x.x.x-50737 because we got no heartbeat in 60 seconds
14/05/14 14:18:41 WARN Master: Removing worker-20140514123848-x.x.x.x-50901 because we got no heartbeat in 60 seconds

In my cluster, I have one master node and four worker nodes. 

On the cluster I am trying to run Shark and related queries.

I tried setting the property spark.worker.timeout=300 on all workers and the
master, but the log still shows a 60-second timeout.
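
One way to confirm which timeout the master is really applying is to read it back from the removal warnings themselves. The following is only a minimal sketch, assuming the log format shown above; the function name `parse_worker_removals`, the regular expression, and the `expected_timeout` variable are illustrative and not part of Spark.

```python
import re
from datetime import datetime

# Pattern for the master's removal warning as it appears in the excerpt above.
# The \s+ before "because" tolerates a line wrap in the middle of the message.
REMOVAL_RE = re.compile(
    r"(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) WARN Master: "
    r"Removing (worker-\S+)\s+because we got no heartbeat in (\d+) seconds"
)


def parse_worker_removals(log_text):
    """Return (timestamp, worker_id, timeout_seconds) for each removal warning."""
    removals = []
    for match in REMOVAL_RE.finditer(log_text):
        stamp, worker_id, timeout = match.groups()
        # Timestamps in the excerpt are formatted as yy/MM/dd HH:mm:ss.
        when = datetime.strptime(stamp, "%y/%m/%d %H:%M:%S")
        removals.append((when, worker_id, int(timeout)))
    return removals


# The two warnings quoted in this message, used as sample input.
sample = """\
14/05/14 10:09:24 WARN Master: Removing worker-20140514080546-x.x.x.x-50737
because we got no heartbeat in 60 seconds
14/05/14 14:18:41 WARN Master: Removing worker-20140514123848-x.x.x.x-50901
because we got no heartbeat in 60 seconds
"""

expected_timeout = 300  # hypothetical: the value set via spark.worker.timeout

for when, worker_id, timeout in parse_worker_removals(sample):
    status = "matches" if timeout == expected_timeout else "differs from"
    print(f"{when} {worker_id}: logged {timeout}s {status} configured {expected_timeout}s")
```

If the logged value stays at 60 after the property has been set to 300, that suggests the master process is not picking up the setting, which is a separate question from why the heartbeats stop arriving.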

After that, I also keep seeing messages like the following:

14/05/14 16:59:52 INFO Master: Removing app app-20140514164003-0009

On the worker nodes, in the work folder, I can't seem to find anything suspicious.

Any help figuring out what is causing all this would be appreciated.
