spark.cores.max configuration property in it, or change the default for applications that don't set this setting through spark.deploy.defaultCores. Finally, in addition to controlling cores, each application's spark.executor.memory setting controls its memory use.
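For example, a minimal sketch of where these settings live (the master URL and the numbers are placeholder values, not recommendations):

# Per-application caps, passed at submit time:
$SPARK_HOME/bin/spark-submit \
  --master spark://master-host:7077 \
  --conf spark.cores.max=200 \
  --conf spark.executor.memory=4g \
  my-app.py

# Cluster-wide default for applications that do not set spark.cores.max,
# set on the master host in conf/spark-env.sh:
export SPARK_MASTER_OPTS="-Dspark.deploy.defaultCores=8"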
Okay:

host=my.local.server
port=someport

This is the spark-submit command, which runs on my local server:

$SPARK_HOME/bin/spark-submit --master spark://$host:$port --executor-memory 4g python-script.py with args

If I want 200 worker cores, I tell the cluster scheduler to run this command on 200 cores:

$SPARK_HOME/sbin/start-slave.sh --cores=1 --memory=4g spark://$host:$port

That's it. When the task starts, it uses all available workers. If for some reason not enough cores are available immediately, it still starts processing with whatever it gets, and the load is spread further as workers come online.

On Thu, May 19, 2016 at 8:24 PM Mich Talebzadeh <email@example.com> wrote:

In normal operation we tell Spark which nodes the worker processes can run on by adding the node names to conf/slaves. It is not very clear to me how this works in your case, where all the jobs run locally with, say, 100 executor cores like below:
--master local[*] \
--driver-memory xg \ --default would be 512M
--num-executors=1 \ -- This is the constraint in a stand-alone Spark cluster, whether specified or not
--executor-memory=xG \ --
--executor-cores=n \

--master local[*] means all cores, and --executor-cores in your case need not be specified? Or you can cap it like above with --executor-cores=n. If it is not specified then the Spark app will go and grab every core, although in practice that does not happen; it is just an upper ceiling. It is FIFO.

What typical executor memory is specified in your case?

Do you have a sample snapshot of a spark-submit job by any chance, Mathieu?

Cheers

On 20 May 2016 at 00:27, Mathieu Longtin <firstname.lastname@example.org> wrote:

Mostly, the resource management is not up to the Spark master. We routinely start 100 executor cores for a 5-minute job, and they just quit when they are done. Then those processor cores can do something else entirely; they are not reserved for Spark at all.

On Thu, May 19, 2016 at 4:55 PM Mich Talebzadeh <email@example.com> wrote:

Then in theory every user can fire multiple spark-submit jobs. Do you cap it with settings in $SPARK_HOME/conf/spark-defaults.conf? But I guess in reality every user submits one job only.

This is an interesting model for two reasons:
- It uses parallel processing across all the nodes or most of the nodes to minimise the processing time
- It requires less intervention

On 19 May 2016 at 21:33, Mathieu Longtin <firstname.lastname@example.org> wrote:

Driver memory is the default. Executor memory depends on the job; the caller decides how much memory to use. We don't specify --num-executors, as we want all cores assigned to the local master, since they were started by the current user. No local executor. --master=spark://localhost:someport. 1 core per executor.

On Thu, May 19, 2016 at 4:12 PM Mich Talebzadeh <email@example.com> wrote:

Thanks Mathieu

So it would be interesting to see what resources are allocated in your case, especially the num-executors and executor-cores. I gather every node has enough memory and cores.
--master local \
--driver-memory 4g \
--executor-cores=2 \

On 19 May 2016 at 21:02, Mathieu Longtin <firstname.lastname@example.org> wrote:

The driver (the process started by spark-submit) runs locally. The executors run on any of thousands of servers. So far, I haven't tried more than 500 executors. Right now, I run a master on the same server as the driver.

On Thu, May 19, 2016 at 3:49 PM Mich Talebzadeh <email@example.com> wrote:

OK, so you are using some form of NFS-mounted file system shared among the nodes, and basically you start the processes through spark-submit.

In stand-alone mode, a simple cluster manager included with Spark does the management of resources, so it is not clear to me what you are referring to as the worker manager here.

This is my take on your model. The application will go and grab all the cores in the cluster. You only have one worker that lives within the driver JVM process. The Driver node runs on the same host that the cluster manager is running on. The Driver requests resources from the Cluster Manager to run tasks. In this case there is only one executor for the Driver? The Executor runs tasks for the Driver.
HTH

On 19 May 2016 at 20:37, Mathieu Longtin <firstname.lastname@example.org> wrote:

No master and no node manager, just the processes that do the actual work. We use the "stand alone" version because we have a shared file system and a way of allocating computing resources already (Univa Grid Engine). If an executor were to die, we have other ways of restarting it; we don't need the worker manager to deal with it.

On Thu, May 19, 2016 at 3:16 PM Mich Talebzadeh <email@example.com> wrote:

Hi Mathieu

What does this approach provide that the norm lacks? So basically each node has its own master in this model. Are these supposed to be individual stand-alone servers?

Thanks

On 19 May 2016 at 18:45, Mathieu Longtin <firstname.lastname@example.org> wrote:

First, a bit of context: we use Spark on a platform where each user starts workers as needed. This has the advantage that all permission management is handled by the OS, so the users can only read files they have permission to.

To do this, we have some utility that does the following:
- start a master
- start worker managers on a number of servers
- "submit" the Spark driver program
- the driver then talks to the master and tells it how many executors it needs
- the master tells the worker nodes to start executors and talk to the driver
- the executors are started

From here on, the master doesn't do much, and neither do the process managers on the worker nodes.

What I would like to do is simplify this to:
- start the driver program
- start executors on a number of servers, telling them where to find the driver
- the executors connect directly to the driver

Is there a way I could do this without the master and worker managers?

Thanks!
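For reference, the overall flow described in this thread (a user-started master, single-core workers launched through the site scheduler, and a driver submitted against that master) looks roughly like the sketch below. The host, port, core count, and the qsub invocation are illustrative assumptions, not the actual utility:

#!/bin/bash
# Rough sketch with placeholder values; adapt to the local scheduler.
host=my.local.server
port=7077                        # default standalone master port

# 1. Start a standalone master on this server.
$SPARK_HOME/sbin/start-master.sh --host $host --port $port

# 2. Have the site scheduler (Univa Grid Engine here) launch 200 single-core
#    workers that register with that master.
for i in $(seq 1 200); do
  qsub -b y $SPARK_HOME/sbin/start-slave.sh --cores 1 --memory 4g spark://$host:$port
done

# 3. Submit the driver against the same master; the workers start executors
#    that connect back to the driver.
$SPARK_HOME/bin/spark-submit --master spark://$host:$port \
  --executor-memory 4g python-script.py  # plus any script arguments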