Hi Axel,

You can try setting `spark.deploy.spreadOut` to false (through your conf/spark-defaults.conf file). This makes the standalone Master schedule as many of an application's cores on one worker as possible before spilling over to other workers. Note that you *must* restart the cluster through the sbin scripts for the change to take effect.
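For example, the entry in spark-defaults.conf would look something like this (assuming the default conf directory layout):

    # conf/spark-defaults.conf
    # pack an application's cores onto as few workers as possible
    spark.deploy.spreadOut  false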

For more information see: http://spark.apache.org/docs/latest/spark-standalone.html.

Feel free to let me know whether it works,
-Andrew


2015-08-18 4:49 GMT-07:00 Igor Berman <igor.berman@gmail.com>:
By default, standalone mode creates 1 executor on every worker machine per application.
The overall number of cores is configured with --total-executor-cores.
So if you specify --total-executor-cores=1, there will be only 1 core on a single executor, and you'll get what you want.
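e.g. something like this (the master URL and application file are placeholders for your own):

    # ask for only one core across the whole cluster,
    # so the app ends up with a single-core executor
    spark-submit \
      --master spark://<master-host>:7077 \
      --total-executor-cores 1 \
      your_app.py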

On the other hand, if your application needs all the cores of your cluster and only some specific job should run on a single executor, there are a few ways to achieve this,
e.g. coalesce(1) or dummyRddWithOnePartitionOnly.foreachPartition
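For example, a rough PySpark sketch (assuming a Python app; the example RDD and the per-partition work are just placeholders):

    from pyspark import SparkContext

    sc = SparkContext(appName="single-executor-job")  # placeholder app name
    rdd = sc.parallelize(range(100))                  # stand-in for your real data

    def do_work(partition):
        # runs inside one task, i.e. on a single executor core
        for record in partition:
            pass  # replace with the work that should not be spread out

    # shrink to one partition so the whole foreachPartition runs in one task
    rdd.coalesce(1).foreachPartition(do_work)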


On 18 August 2015 at 01:36, Axel Dahl <axel@whisperstream.com> wrote:
I have a 4-node cluster and have been playing around with the num-executors, executor-memory, and executor-cores parameters.

I set the following:
--executor-memory=10G
--num-executors=1
--executor-cores=8

But when I run the job, I see that each worker is running one executor with 2 cores and 2.5G of memory.

What I'd like to do instead is have Spark allocate the whole job to a single worker node.

Is that possible in standalone mode, or do I need a job/resource scheduler like YARN to do that?

Thanks in advance,

-Axel