spark-user mailing list archives

From Akhil Das <ak...@sigmoidanalytics.com>
Subject Re: Spark job resource allocation best practices
Date Mon, 03 Nov 2014 15:46:40 GMT
Have a look at scheduling pools
<https://spark.apache.org/docs/latest/job-scheduling.html>. If you want
more sophisticated resource allocation, you are better off using a
cluster manager like Mesos or YARN.
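
As a rough sketch of what that page describes (the values below are
illustrative assumptions, not recommendations): on a standalone cluster,
`spark.cores.max` caps how many cores one application may claim, so later
applications are not left waiting, and `spark.scheduler.mode` switches the
scheduling of jobs inside a single application from FIFO to FAIR. Note that
fair scheduler pools, as far as I understand them, share resources among
jobs within one SparkContext, not across separate applications.

```properties
# spark-defaults.conf for one application (illustrative values only)

# Cap the total cores this application takes from the standalone cluster.
# The number 8 is an assumption; choose it from your cluster size.
spark.cores.max        8

# Schedule jobs submitted within this application fairly instead of FIFO.
spark.scheduler.mode   FAIR
```

The trade-off Romi describes still applies: a static cap leaves cores idle
when the application runs alone, which is why a cluster manager may suit
this case better.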

Thanks
Best Regards

On Mon, Nov 3, 2014 at 9:10 PM, Romi Kuntsman <romi@totango.com> wrote:

> Hello,
>
> I have a Spark 1.1.0 standalone cluster, with several nodes, and several
> jobs (applications) being scheduled at the same time.
> By default, each Spark job takes up all available CPUs.
> This way, when more than one job is scheduled, all but the first are stuck
> in "WAITING".
> On the other hand, if I tell each job to initially limit itself to a fixed
> number of CPUs, and that job runs by itself, the cluster is under-utilized
> and the job runs longer than it could have if it took all the available
> resources.
>
> - How can I get a fairer division of resources, one that lets many
> jobs run together while collectively using all the available resources?
> - How do you divide resources between applications in your use case?
>
> P.S. I started reading about Mesos but couldn't figure out if/how it could
> solve the described issue.
>
> Thanks!
>
> *Romi Kuntsman*, *Big Data Engineer*
>  http://www.totango.com
>
