What CLI args are your referring to?  I'm aware of spark-submit's arguments (--executor-memory, --total-executor-cores, and --executor-cores)

On Thu, Feb 2, 2017 at 12:41 PM, Ji Yan <jiyan@drive.ai> wrote:
I have done a experiment on this today. It shows that only CPUs are tolerant of insufficient cluster size when a job starts. On my cluster, I have 180Gb of memory and 64 cores, when I run spark-submit ( on mesos ) with --cpu_cores set to 1000, the job starts up with 64 cores. but when I set --memory to 200Gb, the job fails to start with "Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources"

Also it is confusing to me that --cpu_cores specifies the number of cpu cores across all executors, but --memory specifies per executor memory requirement.

On Mon, Jan 30, 2017 at 11:34 AM, Michael Gummelt <mgummelt@mesosphere.io> wrote:


On Mon, Jan 30, 2017 at 9:47 AM, Ji Yan <jiyan@drive.ai> wrote:
Tasks begin scheduling as soon as the first executor comes up

Thanks all for the clarification. Is this the default behavior of Spark on Mesos today? I think this is what we are looking for because sometimes a job can take up lots of resources and later jobs could not get all the resources that it asks for. If a Spark job starts with only a subset of resources that it asks for, does it know to expand its resources later when more resources become available?

Yes.
 

Launch each executor with at least 1GB RAM, but if mesos offers 2GB at some moment, then launch an executor with 2GB RAM

This is less useful in our use case. But I am also quite interested in cases in which this could be helpful. I think this will also help with overall resource utilization on the cluster if when another job starts up that has a hard requirement on resources, the extra resources to the first job can be flexibly re-allocated to the second job. 

On Sat, Jan 28, 2017 at 2:32 PM, Michael Gummelt <mgummelt@mesosphere.io> wrote:
We've talked about that, but it hasn't become a priority because we haven't had a driving use case.  If anyone has a good argument for "variable" resource allocation like this, please let me know.

On Sat, Jan 28, 2017 at 9:17 AM, Shuai Lin <linshuai2012@gmail.com> wrote:
An alternative behavior is to launch the job with the best resource offer Mesos is able to give

Michael has just made an excellent explanation about dynamic allocation support in mesos. But IIUC, what you want to achieve is something like (using RAM as an example) : "Launch each executor with at least 1GB RAM, but if mesos offers 2GB at some moment, then launch an executor with 2GB RAM".

I wonder what's benefit of that? To reduce the "resource fragmentation"?

Anyway, that is not supported at this moment. In all the supported cluster managers of spark (mesos, yarn, standalone, and the up-to-coming spark on kubernetes), you have to specify the cores and memory of each executor.

It may not be supported in the future, because only mesos has the concepts of offers because of its two-level scheduling model.


On Sat, Jan 28, 2017 at 1:35 AM, Ji Yan <jiyan@drive.ai> wrote:
Dear Spark Users,

Currently is there a way to dynamically allocate resources to Spark on Mesos? Within Spark we can specify the CPU cores, memory before running job. The way I understand is that the Spark job will not run if the CPU/Mem requirement is not met. This may lead to decrease in overall utilization of the cluster. An alternative behavior is to launch the job with the best resource offer Mesos is able to give. Is this possible with the current implementation?

Thanks
Ji

The information in this email is confidential and may be legally privileged. It is intended solely for the addressee. Access to this email by anyone else is unauthorized. If you are not the intended recipient, any disclosure, copying, distribution or any action taken or omitted to be taken in reliance on it, is prohibited and may be unlawful.





--
Michael Gummelt
Software Engineer
Mesosphere


The information in this email is confidential and may be legally privileged. It is intended solely for the addressee. Access to this email by anyone else is unauthorized. If you are not the intended recipient, any disclosure, copying, distribution or any action taken or omitted to be taken in reliance on it, is prohibited and may be unlawful.




--
Michael Gummelt
Software Engineer
Mesosphere


The information in this email is confidential and may be legally privileged. It is intended solely for the addressee. Access to this email by anyone else is unauthorized. If you are not the intended recipient, any disclosure, copying, distribution or any action taken or omitted to be taken in reliance on it, is prohibited and may be unlawful.




--
Michael Gummelt
Software Engineer
Mesosphere