Is there any high-availability support at the Spark driver level? For example, when a failure happens, can the driver move to another node and continue execution? I can see that RDD checkpointing helps serialize the state of an RDD, and I can imagine loading the checkpoint from another node after an error, but it seems that all task status, and even the executor information maintained in the SparkContext, would be lost. I am not sure whether there is anything existing I can leverage for this. Thanks for any suggestions.
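For reference, here is a minimal sketch of the RDD checkpointing mentioned above (the checkpoint directory path is a hypothetical example; in a real cluster it would point at HDFS or another reliable store). Note that this persists only the materialized RDD data, not the scheduler's task state or the executor bookkeeping held in the SparkContext, which is exactly the gap described in the question:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object CheckpointSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("checkpoint-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Checkpoint data must go to reliable shared storage so that a
    // replacement driver on another node could read it back after a failure.
    sc.setCheckpointDir("hdfs:///tmp/spark-checkpoints") // hypothetical path

    val rdd = sc.parallelize(1 to 1000).map(_ * 2)
    rdd.checkpoint() // marks the RDD for checkpointing (truncates its lineage)
    rdd.count()      // an action triggers the actual checkpoint write

    sc.stop()
  }
}
```

A new driver would still have to rebuild its own SparkContext and re-submit any in-flight jobs; the checkpoint only spares it from recomputing the checkpointed RDD's lineage.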

Best Regards

 
Jun Feng Liu
IBM China Systems & Technology Laboratory in Beijing

