spark-user mailing list archives

From SRK <swethakasire...@gmail.com>
Subject Effective ways monitor and identify that a Streaming job has been failing for the last 5 minutes
Date Tue, 01 Dec 2015 15:45:00 GMT
Hi,

We need to detect when our Streaming job has been failing for the last
5 minutes and restart it accordingly. In most cases our Spark Streaming
job using the Kafka direct stream fails with leader-lost errors or
offsets-not-found errors for a partition. What is the most effective way
to monitor and identify that the Streaming job has been failing with an
error? The default monitoring provided by Spark does not seem to cover
checking whether the job has been failing for a specific length of time.
Or am I missing something, and this feature is already available?

Thanks,
Swetha



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Effective-ways-monitor-and-identify-that-a-Streaming-job-has-been-failing-for-the-last-5-minutes-tp25536.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.