spark-user mailing list archives

From Akhil Das <ak...@hacked.work>
Subject Re: Where does the Driver run?
Date Sun, 24 Mar 2019 04:26:38 GMT
If you are starting your "my-app" on your local machine, that's where the
driver is running.
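
A minimal sketch of how to check this from inside the application,
assuming spark-core 2.x is on the classpath and reusing the placeholder
master address from the question below. The object name WhereIsTheDriver
is made up for illustration. The JVM that constructs the SparkContext acts
as the driver, so both printed values should refer to the machine where
"my-app" was started (one may be a hostname and the other an IP address):

```scala
import java.net.InetAddress
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical object name, used only for this illustration.
object WhereIsTheDriver {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("my-job")
      .setMaster("spark://master-address:7077") // placeholder from the thread

    // Whichever JVM runs this constructor becomes the driver.
    val sc = new SparkContext(conf)
    try {
      // Host of the JVM that is executing this code.
      println(s"This JVM's host:   ${InetAddress.getLocalHost.getHostName}")
      // Address the driver advertises to the cluster; SparkContext fills
      // this in during startup when it has not been set explicitly.
      println(s"spark.driver.host: ${sc.getConf.get("spark.driver.host")}")
    } finally {
      sc.stop()
    }
  }
}
```

As far as I know, a property named just "deployMode" is not one that Spark
reads; the documented key is spark.submit.deployMode, and on a standalone
cluster the usual way to have the driver launched on a worker is
spark-submit with --deploy-mode cluster.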

Hope this helps.
<https://spark.apache.org/docs/latest/cluster-overview.html>

On Sun, Mar 24, 2019 at 4:13 AM Pat Ferrel <pat@occamsmachete.com> wrote:

> I have researched this for a significant amount of time and find answers
> that seem to be for a slightly different question than mine.
>
> The Spark 2.3.3 cluster is running fine. I see the GUI on
> "http://master-address:8080"; there are 2 idle workers, as configured.
>
> I have a Scala application that creates a context and starts execution of
> a Job. I *do not use spark-submit*; I start the Job programmatically, and
> this is where many explanations diverge from my question.
>
> In "my-app" I create a new SparkConf, with the following code (slightly
> abbreviated):
>
>       conf.setAppName("my-job")
>       conf.setMaster("spark://master-address:7077")
>       conf.set("deployMode", "cluster")
>       // other settings like driver and executor memory requests
>       // the driver and executor memory requests are for all mem on the
>       // slaves, more than mem available on the launching machine with
>       // "my-app"
>       val jars = listJars("/path/to/lib")
>       conf.setJars(jars)
>       …
>
> When I launch the job I see 2 executors running on the 2 workers/slaves.
> Everything seems to run fine and sometimes completes successfully. Frequent
> failures are the reason for this question.
>
> Where is the Driver running? I don't see it in the GUI; I see 2 Executors
> taking all cluster resources. With a Yarn cluster I would expect the
> "Driver" to run on/in the Yarn Master, but I am using the Spark Standalone
> Master. Where is the Driver part of the Job running?
>
> If it is running in the Master, we are in trouble because I start the
> Master on one of my 2 Workers, sharing resources with one of the Executors.
> Executor mem + driver mem is > available mem on a Worker. I can change this
> but need to understand where the Driver part of the Spark Job runs. Is it
> in the Spark Master, or inside an Executor, or ???
>
> The “Driver” creates and broadcasts some large data structures so the need
> for an answer is more critical than with more typical tiny Drivers.
>
> Thanks for your help!
>


-- 
Cheers!
