From Christian Kurz <christian.k...@oracle.com>
Subject REST-API for Killing a Streaming Application
Date Thu, 24 Mar 2016 12:51:25 GMT
Hello Spark Streaming Gurus,

For better automation I need to manage my Spark Streaming applications
remotely. These applications read from Kafka, therefore run a receiver
job, and are started via spark-submit. So far I have only found a REST
API for killing Spark applications remotely, and that one only works if
the application runs on a Spark Standalone server.
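
For reference, the endpoint I found is the Standalone Master's REST
submission API on port 6066. A minimal sketch of the kill call I issue,
assuming the default REST port and with a made-up driver ID (I may well
be using a different endpoint than others on this list, so corrections
are welcome):

// Sketch of the Standalone REST kill call. Host, port and driver ID
// are placeholders; the application must have been submitted through
// the Master's REST submission server.
import java.net.{HttpURLConnection, URL}
import scala.io.Source

object KillDriver {
  def main(args: Array[String]): Unit = {
    val driverId = "driver-20160324022723-0015"   // placeholder
    val url = new URL(s"http://spark-master:6066/v1/submissions/kill/$driverId")
    val conn = url.openConnection().asInstanceOf[HttpURLConnection]
    conn.setRequestMethod("POST")
    conn.setDoOutput(true)
    conn.getOutputStream.close()                  // empty POST body
    val body = Source.fromInputStream(conn.getInputStream).mkString
    println(s"HTTP ${conn.getResponseCode}: $body")
    conn.disconnect()
  }
}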

Q1: Is it correct that the Spark WebUI does not provide any way of 
killing an application via a REST call? Should I file a JIRA for this?


My second question/problem is that even on the Spark Standalone server
I am unable to get a timely and complete shutdown:

I see a long delay between sending the kill and the spark-submit OS
process terminating. The delay occurs even though spark-submit
immediately acknowledges the kill on stdout:

org.apache.spark.SparkException: Job aborted due to stage failure: 
Master removed our application: KILLED

But then spark-submit hangs, and it looks to me as if the Spark
receiver is never shut down and may even keep running forever on the
Spark Standalone server: the application is marked with status
"KILLED", but the "Application Detail UI" link still works and shows
that "Streaming job running receiver 0 (Job 2)" is still running.

Note 1: when I kill the same spark-submit job using Ctrl-C, the
application stops immediately as expected, including the shutdown of
the Kafka receiver. On the Spark Standalone server the application then
has status "FINISHED", and the "Application Detail UI" link takes me to
"Application history not found (app-20160324022723-0015)".

Q2: Should the REST API kill call have shut down my spark-submit job
immediately? Should I file a JIRA for this problem?


Background: Even though Ctrl-C on spark-submit seems to work fine, it
is not an option for my Spark automation. Besides general design
concerns, my monitoring JVM may have been restarted after spark-submit
was started, so I cannot rely on the monitoring application still
having access to the spark-submit OS process.
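
In case it helps: the workaround I am considering is to not block in
awaitTermination() at all, but to have the driver poll for an external
shutdown marker and stop itself gracefully. A rough sketch, where the
marker path and the 10-second poll interval are made up:

// Hypothetical marker-file shutdown: the driver polls for a flag file
// and stops the StreamingContext gracefully once it appears.
import org.apache.hadoop.fs.{FileSystem, Path}

val markerPath = new Path("/tmp/streaming-app.shutdown")   // placeholder
val fs = FileSystem.get(ssc.sparkContext.hadoopConfiguration)

var stopped = false
while (!stopped) {
  // Returns true once the context has terminated on its own.
  stopped = ssc.awaitTerminationOrTimeout(10000)
  if (!stopped && fs.exists(markerPath)) {
    // Stop receivers first, process queued data, then shut down.
    ssc.stop(stopSparkContext = true, stopGracefully = true)
    stopped = true
  }
}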


Any thoughts are much appreciated,
Christian
