spark-user mailing list archives

From Ashic Mahtab <>
Subject Executors idle, driver heap exploding and maxing only 1 cpu core
Date Thu, 23 May 2019 14:36:14 GMT
We inherited a rather long-winded Spark application with many stages. When we run it on
our Spark cluster, things start off well enough: the workers are busy and the job makes
steady progress. However, about 30 minutes into processing, CPU usage on the workers drops
drastically. At the same time, the driver maxes out exactly one core (though we've given it
more than one) and its RAM usage creeps up. During this period the driver emits no logs.
Everything seems to stop, then the driver suddenly resumes and the workers start working
again. The driver's RAM usage doesn't go down, but flatlines. A few minutes later the same
thing happens again: the world seems to stop. This time, however, the driver soon crashes
with an out-of-memory exception.

What could be causing this sort of behaviour on the driver? We don't call collect() or
similar functions anywhere in the code. We read from Azure blobs, process the data, and write
back to Azure blobs. Where should we start in trying to get to the bottom of this? We're running
Spark 2.4.1 on a standalone cluster.

