spark-user mailing list archives

From Tomer Benyamini
Subject Driver zombie process (standalone cluster)
Date Wed, 29 Jun 2016 07:05:36 GMT

I'm trying to run Spark applications on a standalone cluster on AWS. My
slaves are spot instances, so they are sometimes terminated when the spot
price rises above our bid. When applications are running at that moment,
the Spark application sometimes dies while the driver process hangs and
stays up indefinitely (a zombie process), holding memory and CPU resources
on the master machine. We then have to run kill -9 manually to free these
resources.

Has anyone seen this kind of problem before? Is there a suggested
workaround?

