Greetings,

This is a scenario for which we need comprehensive answers, please.

Suppose we have 6 Spark VMs, each running two executors, with the job launched via spark-submit, and the following happens:

  1. Two VMs fail at the hardware level (rack failure).
  2. We lose 4 of the 12 Spark executors.
  3. This happens halfway through the spark-submit job.
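
To put the scenario in numbers, here is a minimal Python sketch. The variable names are illustrative only (they are not Spark settings), and it assumes the executors stay evenly spread at two per VM and that nothing replaces the lost ones.

```python
# Scenario in numbers; all names here are illustrative, not Spark configuration keys.
vms_total = 6
executors_per_vm = 2
vms_failed = 2

executors_total = vms_total * executors_per_vm      # executors before the failure
executors_lost = vms_failed * executors_per_vm      # executors on the failed VMs
executors_left = executors_total - executors_lost   # executors still running

# Fraction of executor capacity remaining, assuming no replacements are started
capacity_left = executors_left / executors_total

print(executors_total, executors_lost, executors_left, round(capacity_left, 2))
# prints: 12 4 8 0.67
```

So, under these assumptions, roughly two thirds of the original executor capacity is left for the second half of the job.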

So my humble questions are:

  1. Will any data be missing from the final result because of the lost nodes?
  2. How will RDD lineage handle this?
  3. Will there be any delay in getting the final result?
  4. How will the driver handle the failure of these two nodes?
  5. Will additional executors be started on the surviving nodes, or will the existing executors take over the work of the 4 failed executors?
  6. What happens if we are running in client mode and the node holding the driver dies?
  7. What happens if the same occurs while running in cluster mode?

I searched Google but found no satisfactory answers, gurus, hence turning to the forum.

Best

A.K.