spark-user mailing list archives

From "Rakesh H (Marketing Platform-BLR)" <rakes...@flipkart.com>
Subject Graceful shutdown of spark streaming on yarn
Date Thu, 12 May 2016 06:12:56 GMT
The issue I am having is similar to the one mentioned here:
http://stackoverflow.com/questions/36911442/how-to-stop-gracefully-a-spark-streaming-application-on-yarn

I am creating an RDD from the sequence 1 to 300 and building a constant
input DStream out of it.

import org.apache.spark.streaming.dstream.ConstantInputDStream

val rdd = ssc.sparkContext.parallelize(1 to 300)
val dstream = new ConstantInputDStream(ssc, rdd)
dstream.foreachRDD { rdd =>
  rdd.foreach { x =>
    log(x)           // user-defined logging helper
    Thread.sleep(50) // slow each element down so the batch takes a while
  }
}
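(For context: since Spark 1.4 there is a configuration flag that asks the StreamingContext to stop gracefully when the driver JVM receives a shutdown signal. A sketch of passing it at submit time; the jar and class names below are made up:)

spark-submit \
  --master yarn-cluster \
  --conf spark.streaming.stopGracefullyOnShutdown=true \
  --class com.example.StreamingJob \
  my-streaming-job.jar

Note this only helps if the driver gets a SIGTERM and a chance to run its shutdown hooks; it does not make a hard container kill graceful.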


When I kill this job, I expect elements 1 to 300 to be logged before it
shuts down. That is indeed the case when I run it locally: it waits for the
job to finish before shutting down.

But when I launch the job on the cluster in "yarn-cluster" mode, it shuts
down abruptly.
The executor prints the following log:

ERROR executor.CoarseGrainedExecutorBackend:
Driver xx.xx.xx.xxx:yyyyy disassociated! Shutting down.

and then it shuts down. It is not a graceful shutdown.

Does anybody know how to do this on YARN?
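For what it's worth, one pattern often suggested for this (a sketch only, not tested on a cluster; the HDFS marker path and poll interval are assumptions) is to avoid `yarn application -kill` entirely: have the driver poll for an external "stop" marker file, then call `StreamingContext.stop` with `stopGracefully = true`, which lets queued and in-flight batches complete before shutdown:

import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.streaming.StreamingContext

// Hypothetical helper: block until an external stop marker appears on
// HDFS, then ask Spark to finish current batches before exiting.
def awaitShutdownMarker(ssc: StreamingContext, markerPath: String): Unit = {
  val fs = FileSystem.get(ssc.sparkContext.hadoopConfiguration)
  var stopRequested = false
  while (!stopRequested) {
    Thread.sleep(5000) // poll interval, arbitrary choice
    stopRequested = fs.exists(new Path(markerPath))
  }
  // stopSparkContext = true, stopGracefully = true:
  // completes the batches already received before shutting down.
  ssc.stop(stopSparkContext = true, stopGracefully = true)
}

ssc.start()
awaitShutdownMarker(ssc, "/tmp/streaming-stop-marker") // hypothetical path

To stop the job you would then `hdfs dfs -touchz /tmp/streaming-stop-marker` instead of killing the YARN application, so the driver itself initiates the graceful stop.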
