spark-user mailing list archives

From "Lalwani, Jayesh" <jlalw...@amazon.com.INVALID>
Subject Re: Recovery when two spark nodes out of 6 fail
Date Fri, 25 Jun 2021 17:15:54 GMT
“Does this mean that only the tasks that the dead executor was executing at the time need
to be rerun to regenerate the lost data? I read somewhere that RDD lineage keeps track
of what needs to be re-executed.”

It uses RDD lineage to figure out what needs to be re-executed. I don’t know the details
of how it determines which tasks to run, but I’m guessing that in a multi-stage job it
might have to rerun earlier stages. For example, if you have done a groupBy, you will
have 2 stages. After the first stage, the data will be shuffled by hashing the groupBy key,
so that data for the same value of the key lands in the same partition. Now, if one of those partitions
is lost during execution of the second stage, I am guessing Spark will have to go back and re-execute
all the tasks in the first stage.
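To make the shuffle hashing concrete, here is a toy sketch in plain Python (not the Spark API; `partition_for` is a stand-in for what Spark's HashPartitioner does with `hashCode() % numPartitions`):

```python
# Sketch (plain Python, not Spark): how a shuffle assigns groupBy keys to
# reduce-side partitions by hashing the key, so every record with the same
# key lands in the same partition.

NUM_PARTITIONS = 4

def partition_for(key, num_partitions=NUM_PARTITIONS):
    # Spark's HashPartitioner does essentially key.hashCode() mod numPartitions;
    # Python's built-in hash() stands in for hashCode() here.
    return hash(key) % num_partitions

records = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]

partitions = {i: [] for i in range(NUM_PARTITIONS)}
for key, value in records:
    partitions[partition_for(key)].append((key, value))

# Every record with the same key ends up in the same partition. If that
# partition is lost mid-stage-2, Spark must rerun the stage-1 tasks that
# produced it.
```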

From: "ashok34668@yahoo.com.INVALID" <ashok34668@yahoo.com.INVALID>
Date: Friday, June 25, 2021 at 12:57 PM
To: "user@spark.apache.org" <user@spark.apache.org>, "Lalwani, Jayesh" <jlalwani@amazon.com.INVALID>
Subject: RE: [EXTERNAL] Recovery when two spark nodes out of 6 fail


CAUTION: This email originated from outside of the organization. Do not click links or open
attachments unless you can confirm the sender and know the content is safe.


Thank you for detailed explanation.

Please clarify the below:

 .... If one executor fails, it moves the processing over to another executor. However, if the
data is lost, it re-executes the processing that generated the data, and might have to go
back to the source.

Does this mean that only the tasks that the dead executor was executing at the time need
to be rerun to regenerate the lost data? I read somewhere that RDD lineage keeps track
of what needs to be re-executed.

best

On Friday, 25 June 2021, 16:23:32 BST, Lalwani, Jayesh <jlalwani@amazon.com.invalid>
wrote:



Spark replicates the partitions among multiple nodes. If one executor fails, it moves the
processing over to another executor. However, if the data is lost, it re-executes the processing
that generated the data, and might have to go back to the source.
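A toy illustration of that recovery idea (plain Python, not Spark's actual internals; the class and names here are made up for illustration): a partition remembers the chain of transformations that produced it, so a lost result can be recomputed from the durable source instead of needing a stored replica.

```python
# Toy sketch of lineage-based recovery: a "partition" records the functions
# that produced it, so a lost result can be rebuilt by replaying them.

class LineagePartition:
    def __init__(self, source_data):
        self.source_data = source_data   # durable input (think: an HDFS block)
        self.lineage = []                # recorded transformations
        self.result = None               # computed data; may be "lost"

    def map(self, fn):
        self.lineage.append(fn)
        return self

    def compute(self):
        data = self.source_data
        for fn in self.lineage:          # replay the lineage in order
            data = [fn(x) for x in data]
        self.result = data
        return data

part = LineagePartition([1, 2, 3]).map(lambda x: x * 10).map(lambda x: x + 1)
first = part.compute()       # [11, 21, 31]

part.result = None           # simulate losing the executor that held the data
recovered = part.compute()   # recompute from the source via the lineage
assert recovered == first
```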



In case of failure, there will be a delay in getting results. The amount of delay depends on
how much reprocessing Spark needs to do.



When the driver executes an action, it submits a job to the Cluster Manager. The Cluster Manager
starts submitting tasks to executors and monitoring them. If an executor dies, the Cluster
Manager does the work of reassigning its tasks. While all of this is going on, the driver
is just sitting there waiting for the action to complete. So, the driver does nothing, really.
The Cluster Manager does most of the work of managing the workload.



Spark, by itself, doesn’t add executors when executors fail. It just moves the tasks to other
executors. If you are installing plain vanilla Spark on your own cluster, you need to figure
out how to bring back executors. Most of the popular platforms built on top of Spark (Glue,
EMR, Kubernetes) will replace failed nodes. You need to look into the capabilities of your
chosen platform.
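For plain vanilla Spark, a few standard configuration properties govern how task failures are retried and whether replacement executors are requested from the cluster manager (a sketch only; the values are illustrative, and dynamic allocation additionally needs shuffle tracking or an external shuffle service):

```shell
# Illustrative spark-submit settings relevant to executor loss.
# spark.task.maxFailures: attempts per task before the whole job fails.
# spark.stage.maxConsecutiveAttempts: stage retries after fetch failures.
# spark.dynamicAllocation.*: lets Spark request replacement executors.
spark-submit \
  --conf spark.task.maxFailures=8 \
  --conf spark.stage.maxConsecutiveAttempts=4 \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.dynamicAllocation.shuffleTracking.enabled=true \
  my_job.py
```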



If the driver dies, the Spark job dies. There’s no recovering from that. The only way to
recover is to run the job again. Batch jobs do not have checkpointing, so they will need to
reprocess everything from the beginning. You need to write your jobs to be idempotent; i.e.,
rerunning them shouldn’t change the outcome. Streaming jobs have checkpointing, and they
will restart from the last microbatch. This means that they might have to repeat the last microbatch.
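A minimal sketch of what "idempotent" means here (plain Python; `run_batch` and `sink` are hypothetical names, not a Spark API): the job derives its output deterministically from the input and overwrites by key rather than appending, so rerunning after a crash cannot double-count.

```python
# Sketch: a run-twice-safe (idempotent) batch write. Output is keyed by the
# input row id and overwritten (upsert), never appended, so a rerun after a
# driver crash leaves the sink in exactly the same state.

def run_batch(source_rows, sink):
    for row_id, value in source_rows:
        sink[row_id] = value * 2   # same input -> same output, every run

sink = {}
source = [(1, 10), (2, 20), (3, 30)]

run_batch(source, sink)   # first attempt
run_batch(source, sink)   # rerun after a failure: no change to the result
assert sink == {1: 20, 2: 40, 3: 60}
```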



From: "ashok34668@yahoo.com.INVALID" <ashok34668@yahoo.com.INVALID>
Date: Friday, June 25, 2021 at 10:38 AM
To: "user@spark.apache.org" <user@spark.apache.org>

Subject: [EXTERNAL] Recovery when two spark nodes out of 6 fail







Greetings,



This is a scenario for which we need to come up with a comprehensive answer, please.



Suppose we have 6 Spark VMs, each running two executors via spark-submit.



  1.  we have two VM failures at the hardware level (a rack failure)
  2.  we lose 4 of our 12 Spark executors
  3.  this happens halfway through the spark-submit job

So my humble questions are:



  1.  Will any data be lost from the final result due to the missing nodes?
  2.  How will RDD lineage handle this?
  3.  Will there be any delay in getting the final result?
  4.  How will the driver handle these two node failures?
  5.  Will additional executors be added to the existing nodes, or will the existing executors
handle the work of the 4 failed executors?
  6.  What happens if running in client mode and the node holding the driver dies?
  7.  What happens if running in cluster mode?



I searched Google but found no satisfactory answers, gurus, hence turning to the forum.



Best



A.K.