spark-user mailing list archives

From S <>
Subject Understanding life cycle of RpcEndpoint: CoarseGrainedExecutorBackend
Date Wed, 18 Dec 2019 13:31:27 GMT
I am trying to understand the lifecycle of an RpcEndpoint.

Here is my understanding: After negotiating containers from the
ClusterManager, the master starts the CoarseGrainedExecutorBackend on the
worker which connects back to the CoarseGrainedSchedulerBackend's
DriverEndpoint which sends requests/messages to the

*Q1: My inference is that the lifecycle of CoarseGrainedExecutorBackend is:
onConnected() -> onStart() -> receive -> onStop(). The receive() method
keeps taking the requests/messages and executing them, meaning that
receive() is called multiple times throughout the endpoint's lifecycle. Is
my understanding right?*
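
To make Q1 concrete, here is a toy Python model of the lifecycle I have in mind. It is only a sketch of my mental model, not Spark code: `ToyEndpoint`, `run_lifecycle` and the message strings are names I made up for illustration, and onConnected() is left out to keep it short.

```python
class ToyEndpoint:
    """Toy stand-in for an RPC endpoint (hypothetical, not Spark's RpcEndpoint)."""

    def __init__(self):
        # Records the order in which lifecycle callbacks fire.
        self.events = []

    def on_start(self):
        # Assumed to run once, before any message is handled.
        self.events.append("onStart")

    def receive(self, message):
        # Assumed to run once per incoming message, so many times overall.
        self.events.append(f"receive:{message}")

    def on_stop(self):
        # Assumed to run once, after the last message.
        self.events.append("onStop")


def run_lifecycle(endpoint, messages):
    """Drive the endpoint through start, one receive per message, then stop."""
    endpoint.on_start()
    for message in messages:
        endpoint.receive(message)
    endpoint.on_stop()
    return endpoint.events


if __name__ == "__main__":
    # The message names here are placeholders, not real Spark messages.
    print(run_lifecycle(ToyEndpoint(), ["msg-1", "msg-2", "msg-3"]))
```

In this model the same endpoint object handles every message, which is the behaviour I am asking about.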

*Q2: The receive method executes "messages/requests" as per the source code.
What exactly are these messages/requests? Do they refer to the "set of
tasks assigned to this particular RpcEndpoint" from a stage of a Spark
RDD on its individual partitions?*

*Q3: If the receive method is indeed called multiple times through the
course of a Spark job, where each request refers to the set of task(s) of a
stage, does this mean a new Executor is instantiated whenever the
receive() method is called (as the code suggests in line 129)? That would
in turn happen every time a stage is executed and a set of tasks is
sent to a particular RpcEndpoint (CoarseGrainedExecutorBackend), after
shutting down the executor from the previous stage.*

I have put this question up on SO as well @

It would be a lot of help if one could elaborate and shed light on these
