OK, this is basically from my notes for Spark standalone. The worker process is the slave process.

You start the worker as you showed.


Now that picks up the worker host node names from the $SPARK_HOME/conf/slaves file. So you still have to tell Spark where to run workers.

However, if I am correct, regardless of what you have specified in slaves, in this standalone mode the driver will not spawn any Spark process on the slaves. In all probability you will be running one spark-submit process on the driver node. You can see this in the output of

jps|grep SparkSubmit 

and you will see the details by running jmonitor for that SparkSubmit job

However, I still doubt whether Scheduling Across applications is feasible in standalone mode.

The doc says

Standalone mode: By default, applications submitted to the standalone mode cluster will run in FIFO (first-in-first-out) order, and each application will try to use all available nodes. You can limit the number of nodes an application uses by setting the spark.cores.max configuration property in it, or change the default for applications that don’t set this setting through spark.deploy.defaultCores. Finally, in addition to controlling cores, each application’s spark.executor.memory setting controls its memory use.

It says "all available nodes", but I am not convinced that it will actually use those nodes. Perhaps someone can clarify this.
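
As a rough illustration of the settings named in that doc passage, here is a minimal Python sketch that only builds a spark-submit command line carrying spark.cores.max and spark.executor.memory as --conf properties. The helper name standalone_submit_cmd, the host "master-host" and port 7077 are placeholders of mine, not something from this thread, and nothing is actually launched.

```python
def standalone_submit_cmd(master_url, app, cores_max=None, executor_memory=None):
    """Build (but do not run) a spark-submit argv list for a standalone master.

    cores_max       -> spark.cores.max, the cap on total cores for this application
    executor_memory -> spark.executor.memory, memory per executor (e.g. "4g")
    """
    cmd = ["spark-submit", "--master", master_url]
    if cores_max is not None:
        cmd += ["--conf", f"spark.cores.max={cores_max}"]
    if executor_memory is not None:
        cmd += ["--conf", f"spark.executor.memory={executor_memory}"]
    cmd.append(app)
    return cmd


# Placeholder master URL; python-script.py is the script name used in this thread.
cmd = standalone_submit_cmd(
    "spark://master-host:7077", "python-script.py",
    cores_max=8, executor_memory="4g",
)
print(" ".join(cmd))
```

If spark.cores.max is left out, the application falls back to spark.deploy.defaultCores on the master, as the quoted passage describes.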


On 20 May 2016 at 02:03, Mathieu Longtin <mathieu@closetwork.org> wrote:

This is the spark-submit command, which runs on my local server:
$SPARK_HOME/bin/spark-submit --master spark://$host:$port --executor-memory 4g python-script.py with args

If I want 200 worker cores, I tell the cluster scheduler to run this command on 200 cores:
$SPARK_HOME/sbin/start-slave.sh --cores=1 --memory=4g spark://$host:$port 

That's it. When the task starts, it uses all available workers. If for some reason, not enough cores are available immediately, it still starts processing with whatever it gets and the load will be spread further as workers come online.
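
To make the sizing concrete, here is a small Python sketch of what the grid scheduler would be asked to run and what that adds up to. It assumes the documented form "start-slave.sh <master-url> --cores N --memory M", which is spelled slightly differently from the command quoted above; the helper names, "/opt/spark" and "master-host:7077" are placeholders of mine, and nothing is executed.

```python
def worker_cmd(spark_home, master_url, cores=1, memory="4g"):
    """Command line for one standalone worker (built only, not executed)."""
    return [
        f"{spark_home}/sbin/start-slave.sh", master_url,
        "--cores", str(cores),
        "--memory", memory,
    ]


def total_resources(n_workers, cores_per_worker=1, memory_gb_per_worker=4):
    """Aggregate capacity offered to the master once all workers have registered."""
    return {
        "cores": n_workers * cores_per_worker,
        "memory_gb": n_workers * memory_gb_per_worker,
    }


one_worker = worker_cmd("/opt/spark", "spark://master-host:7077")
capacity = total_resources(200)  # 200 one-core workers with 4g each
print(" ".join(one_worker))
print(capacity)
```

With 200 one-core, 4g workers the master ends up with 200 cores and 800 GB to hand out, and the application starts with whatever subset has registered so far.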

On Thu, May 19, 2016 at 8:24 PM Mich Talebzadeh <mich.talebzadeh@gmail.com> wrote:
In normal operation we tell Spark which nodes the worker processes can run on by adding the node names to conf/slaves.

I am not very clear on this. In your case, do all the jobs run locally with, say, 100 executor cores, like below?

${SPARK_HOME}/bin/spark-submit \
                --master local[*] \
                --driver-memory xg \
                --num-executors=1 \
                --executor-memory=xG \
                --executor-cores=n \

Notes: --driver-memory would default to 512M; --num-executors=1 is the constraint in a stand-alone Spark cluster, whether specified or not.

--master local[*] means all cores, so --executor-cores need not be specified in your case? Or you can cap it as above with --executor-cores=n. If it is not specified, the Spark app will go and grab every core, although in practice that does not happen; it is just an upper ceiling. It is FIFO.

What typical executor memory is specified in your case?

Do you have a sample snapshot of a spark-submit job by any chance, Mathieu?

On 20 May 2016 at 00:27, Mathieu Longtin <mathieu@closetwork.org> wrote:
Mostly, the resource management is not up to the Spark master. 

We routinely start 100 executor cores for a 5-minute job, and they just quit when they are done. Then those processor cores can do something else entirely; they are not reserved for Spark at all.

On Thu, May 19, 2016 at 4:55 PM Mich Talebzadeh <mich.talebzadeh@gmail.com> wrote:
Then in theory every user can fire multiple spark-submit jobs. Do you cap that with settings in $SPARK_HOME/conf/spark-defaults.conf? I guess in reality every user submits only one job.

This is an interesting model for two reasons:

  • It uses parallel processing across all the nodes or most of the nodes to minimise the processing time
  • It requires less intervention

On 19 May 2016 at 21:33, Mathieu Longtin <mathieu@closetwork.org> wrote:
Driver memory is default. Executor memory depends on job, the caller decides how much memory to use. We don't specify --num-executors as we want all cores assigned to the local master, since they were started by the current user. No local executor.  --master=spark://localhost:someport. 1 core per executor. 

On Thu, May 19, 2016 at 4:12 PM Mich Talebzadeh <mich.talebzadeh@gmail.com> wrote:
Thanks Mathieu

So it would be interesting to see what resources are allocated in your case, especially num-executors and executor-cores. I gather every node has enough memory and cores.


${SPARK_HOME}/bin/spark-submit \
                --master local[2] \
                --driver-memory 4g \
                --num-executors=1 \
                --executor-memory=4G \
                --executor-cores=2 \

On 19 May 2016 at 21:02, Mathieu Longtin <mathieu@closetwork.org> wrote:
The driver (the process started by spark-submit) runs locally. The executors run on any of thousands of servers. So far, I haven't tried more than 500 executors.

Right now, I run a master on the same server as the driver.

On Thu, May 19, 2016 at 3:49 PM Mich Talebzadeh <mich.talebzadeh@gmail.com> wrote:
OK, so you are using some form of NFS-mounted file system shared among the nodes, and basically you start the processes through spark-submit.

Stand-alone mode is a simple cluster manager included with Spark. It manages the resources, so it is not clear to me what you are referring to as the worker manager here.

This is my take on your model:
The application will go and grab all the cores in the cluster.
You only have one worker, which lives within the driver JVM process.
The driver runs on the same host as the cluster manager. The driver asks the cluster manager for resources to run tasks. In this case there is only one executor for the driver? The executor runs tasks for the driver.

On 19 May 2016 at 20:37, Mathieu Longtin <mathieu@closetwork.org> wrote:
No master and no node manager, just the processes that do actual work.

We use the "stand alone" version because we have a shared file system and a way of allocating computing resources already (Univa Grid Engine). If an executor were to die, we have other ways of restarting it; we don't need the worker manager to deal with it.

On Thu, May 19, 2016 at 3:16 PM Mich Talebzadeh <mich.talebzadeh@gmail.com> wrote:
Hi Mathieu

What does this approach provide that the norm lacks?

So basically each node has its master in this model.

Are these supposed to be individual stand alone servers?


On 19 May 2016 at 18:45, Mathieu Longtin <mathieu@closetwork.org> wrote:
First a bit of context: 
We use Spark on a platform where each user starts workers as needed. This has the advantage that all permission management is handled by the OS, so the users can only read files they have permission to.

To do this, we have some utility that does the following:
- start a master
- start worker managers on a number of servers
- "submit" the Spark driver program
- the driver then talks to the master and tells it how many executors it needs
- the master tells the worker nodes to start executors and talk to the driver
- the executors are started
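
The first three steps above can be sketched as an ordered launch plan. This is only an illustration in Python, not our actual utility: launch_plan, "/opt/spark" and the host names are hypothetical, and the plan just lists (host, command) pairs for whatever mechanism dispatches them (the grid scheduler in our case). The remaining steps happen inside Spark once the driver connects to the master.

```python
def launch_plan(spark_home, master_host, master_port, worker_hosts, app):
    """Return the ordered (host, argv) steps; nothing is executed here."""
    master_url = f"spark://{master_host}:{master_port}"
    steps = []
    # 1. start a master
    steps.append((master_host, [
        f"{spark_home}/sbin/start-master.sh",
        "--host", master_host, "--port", str(master_port),
    ]))
    # 2. start a worker manager on each server, pointing it at the master
    for host in worker_hosts:
        steps.append((host, [f"{spark_home}/sbin/start-slave.sh", master_url]))
    # 3. "submit" the driver program; it then negotiates executors with the master
    steps.append((master_host, [
        f"{spark_home}/bin/spark-submit", "--master", master_url, app,
    ]))
    return steps


plan = launch_plan("/opt/spark", "head-node", 7077, ["node1", "node2"], "python-script.py")
for host, argv in plan:
    print(host, " ".join(argv))
```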

From here on, the master doesn't do much, and neither does the process manager on the worker nodes.

What I would like to do is simplify this to:
- Start the driver program
- Start executors on a number of servers, telling them where to find the driver
- The executors connect directly to the driver 

Is there a way I could do this without the master and worker managers?


Mathieu Longtin