spark-user mailing list archives

From KhajaAsmath Mohammed <mdkhajaasm...@gmail.com>
Subject High availability with Spark
Date Sat, 16 Jul 2016 16:13:41 GMT
Hi,

Could you please share your thoughts if anyone has ideas on the topics below.

   - How to achieve high availability with a Spark cluster? I have referred
   to https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/exercises/spark-exercise-standalone-master-ha.html.
   Is there any other way to do this in cluster mode?
   - How to achieve high availability of the Spark driver? I have gone through
   the documentation, which says it is achieved through a checkpointing
   directory (see the sketch after this list). Is there any other way?
   - What is the procedure to know the number of messages that have been
   consumed by the consumer? Is there any way to track the number of messages
   consumed in Spark Streaming (sketched below as well)?
   - I also want to save data from Spark Streaming periodically and run
   aggregations on it; let's say, save data for every hour/day, etc., and
   aggregate that (see the last sketch below).
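
For the driver question, here is a rough sketch of what I understand from the
checkpointing documentation. The checkpoint path, app name, and batch interval
are just placeholders for illustration:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object CheckpointedDriver {
  // placeholder checkpoint location; any HDFS/S3 path visible to all nodes works
  val checkpointDir = "hdfs:///user/asmath/streaming-checkpoint"

  def createContext(): StreamingContext = {
    val conf = new SparkConf().setAppName("CheckpointedDriver")
    val ssc = new StreamingContext(conf, Seconds(10))
    ssc.checkpoint(checkpointDir)
    // define the DStream sources and transformations here
    ssc
  }

  def main(args: Array[String]): Unit = {
    // on restart, the driver rebuilds its state from the checkpoint directory
    // instead of building a fresh context
    val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
    ssc.start()
    ssc.awaitTermination()
  }
}

My understanding is that this only recovers the streaming state; the driver
process itself still has to be restarted by something else, e.g. spark-submit
in cluster mode with --supervise on standalone, or YARN restarting the
application. Please correct me if that is wrong.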
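
For tracking consumed messages, the only way I know of is to count each
micro-batch on the driver, roughly like below, where lines stands for whatever
DStream the job already consumes:

lines.foreachRDD { (rdd, time) =>
  // count() is an action, so the result comes back to the driver,
  // where it can be logged or pushed to a metrics system
  println(s"Batch at $time consumed ${rdd.count()} records")
}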
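
And for the periodic save and aggregation, this is roughly what I had in mind:
bucket each micro-batch under an hour directory and aggregate those directories
later with a separate batch job. The output path and date format are
placeholders:

import java.text.SimpleDateFormat
import java.util.Date

lines.foreachRDD { (rdd, time) =>
  // bucket each micro-batch by the hour it arrived in; a batch job can then
  // aggregate one hour directory at a time
  val hour = new SimpleDateFormat("yyyy-MM-dd-HH").format(new Date(time.milliseconds))
  if (!rdd.isEmpty()) {
    rdd.saveAsTextFile(s"hdfs:///user/asmath/streaming-output/$hour/batch-${time.milliseconds}")
  }
}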


Thanks,
Asmath.
