I can’t speak to Mesos solutions, but with YARN you can define queues in which to run your jobs, and you can customize the amount of resources each queue consumes.  When deploying your Spark job, you can specify the --queue <queue_name> option to schedule the job on a particular queue.  Here are some links for reference:

http://hadoop.apache.org/docs/r2.5.1/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html
http://lucene.472066.n3.nabble.com/Capacity-Scheduler-on-YARN-td4081242.html
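
If you use the CapacityScheduler, the queues are defined in capacity-scheduler.xml on the ResourceManager. As a minimal sketch (the queue names "prod" and "adhoc" and the 70/30 capacity split are illustrative assumptions, not something from this thread):

```xml
<!-- capacity-scheduler.xml: two queues sharing the cluster.
     Queue names and percentages here are illustrative only. -->
<configuration>
  <property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>prod,adhoc</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.prod.capacity</name>
    <value>70</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.adhoc.capacity</name>
    <value>30</value>
  </property>
</configuration>
```

A job would then be submitted with something like `spark-submit --master yarn --queue adhoc ...` so it is scheduled against the smaller queue.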



From: Luis Guerra <luispelayo84@gmail.com>
Date: Tuesday, January 13, 2015 at 3:19 AM
To: "Buttler, David" <buttler1@llnl.gov>, user <user@spark.apache.org>
Subject: Re: Spark executors resources. Blocking?

Thanks for your answer David,

It is as I thought, then. When you write that there are some approaches to solving this with Yarn or Mesos, can you give any pointers? Or better yet, is there any documentation about this issue? We currently launch our jobs using Yarn, but we still do not know how to schedule our jobs so as to get the highest utilization of our cluster.

Best

On Tue, Jan 13, 2015 at 2:12 AM, Buttler, David <buttler1@llnl.gov> wrote:

Spark has a built-in cluster manager (the Spark standalone cluster), but it is not designed for multi-tenancy, whether that means multiple people using the system or multiple tasks sharing resources.  It is a first-in, first-out queue of tasks, where a task blocks until the previous tasks are finished (as you described).  If you want higher utilization of your cluster, you should use either Yarn or Mesos to schedule the system.  The same issues will come up, but they offer a much broader range of approaches for solving the problem.
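
One mitigation within the standalone manager itself: by default an application reserves all available cores, which is what causes the blocking, but capping the cores per application lets applications run side by side. A sketch (the master URL and the value 12 are placeholders, not from this thread):

```
# Standalone mode: limit this application to 12 cores so other
# applications can acquire the remaining cores concurrently.
# (Master URL, core count, and job file are illustrative.)
spark-submit \
  --master spark://master:7077 \
  --conf spark.cores.max=12 \
  my_job.py
```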


Dave


From: Luis Guerra [mailto:luispelayo84@gmail.com]
Sent: Monday, January 12, 2015 8:36 AM
To: user
Subject: Spark executors resources. Blocking?


Hello all,


I have a naive question regarding how Spark uses the executors in a cluster of machines. Imagine a scenario in which I do not know the input size of my data for execution A, so I set Spark to use 20 nodes (out of my 25, for instance). At the same time, I also launch a second execution B, setting Spark to use 10 nodes for it.


Assuming a huge input size for execution A, implying an execution time of 30 minutes for example (using all the resources), and a constant execution time of 10 minutes for B, both executions together will last 40 minutes (I assume that B cannot be launched until 10 resources are completely available, i.e. when A finishes).


Now, assuming a very small input size for execution A, running for 5 minutes on only 2 of the 20 planned resources, I would like execution B to be launched at that time, so that both executions together take only 10 minutes (and 12 resources). However, since execution A has told Spark to use 20 resources, execution B has to wait until A has finished, so the total execution time is 15 minutes.
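
The arithmetic in the two scenarios can be sketched as a toy model (this only restates the reasoning above; it is not how Spark computes anything):

```python
# Toy model of the FIFO behavior described above: execution B cannot
# start until execution A releases the resources it reserved.

def total_fifo_minutes(a_minutes, b_minutes):
    # B is queued behind A, so the wall-clock total is the sum.
    return a_minutes + b_minutes

def total_concurrent_minutes(a_minutes, b_minutes):
    # If both executions could run side by side, the total is the longer one.
    return max(a_minutes, b_minutes)

print(total_fifo_minutes(30, 10))        # scenario 1 -> 40
print(total_fifo_minutes(5, 10))         # scenario 2, as observed -> 15
print(total_concurrent_minutes(5, 10))   # scenario 2, as desired -> 10
```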


Is this right? If so, how can I handle this kind of scenario? If I am wrong, what is the correct interpretation?


Thanks in advance, 


Best