spark-dev mailing list archives

From asma zgolli <>
Subject How to develop a listener that collects the statistics for spark sql and execution time for each operator
Date Tue, 19 Mar 2019 16:57:21 GMT
Hello,

I'm looking for a way to develop a listener that collects the statistics
for Spark SQL queries, as well as the execution time for each physical
operator of the physical plan, and stores them in a database.

I want to develop an application similar to the following example:
import org.apache.spark.scheduler._
import org.apache.log4j.LogManager

class CustomListener extends SparkListener {
  val logger = LogManager.getLogger("CustomListener")
  override def onStageCompleted(stageCompleted: SparkListenerStageCompleted): Unit = {
    logger.warn(s"Stage completed, runTime: ${stageCompleted.stageInfo.taskMetrics.executorRunTime}, " +
      s"cpuTime: ${stageCompleted.stageInfo.taskMetrics.executorCpuTime}")
  }
}

val myListener = new CustomListener
// sc is the active Spark Context
sc.addSparkListener(myListener)

// run a simple Spark job and note the additional warning messages emitted
// by the CustomListener with Spark execution metrics, for example run:
spark.time(sql("select count(*) from range(1e4) cross join range(1e4)").collect())
but collecting Spark SQL runtime statistics instead of stage-level metrics.
I want to store the same statistics as the ones shown in the attached picture.
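One possible direction, sketched below under the assumption that you are on a recent Spark 2.x: instead of a SparkListener, register a QueryExecutionListener via spark.listenerManager. After each query it receives the full QueryExecution, whose executedPlan can be traversed node by node; each physical operator exposes its SQLMetric values (number of output rows, scan time, etc.), which you could then write to a database. The println is a placeholder for your own database sink.

```scala
import org.apache.spark.sql.execution.QueryExecution
import org.apache.spark.sql.util.QueryExecutionListener

// Collects per-operator SQL metrics after each successful query.
class SqlMetricsListener extends QueryExecutionListener {
  override def onSuccess(funcName: String, qe: QueryExecution, durationNs: Long): Unit = {
    // walk every node of the executed physical plan
    qe.executedPlan.foreach { node =>
      // each operator carries a map of named SQLMetrics
      node.metrics.foreach { case (name, metric) =>
        // replace this println with an insert into your database
        println(s"${node.nodeName}: $name = ${metric.value} (query took ${durationNs / 1e6} ms)")
      }
    }
  }
  override def onFailure(funcName: String, qe: QueryExecution, error: Exception): Unit = ()
}

// register it on the active SparkSession:
// spark.listenerManager.register(new SqlMetricsListener)
```

Note that the metric values are the ones the SQL web UI displays per operator, so this should give you the same numbers as the attached picture.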

thank you very much

yours sincerely,

PhD student in data engineering - computer science
Email :
email alt:
