spark-user mailing list archives

From Ajay Singal <asinga...@gmail.com>
Subject Re: Controlling number of executors on Mesos vs YARN
Date Thu, 13 Aug 2015 14:10:32 GMT
Tim,

The ability to specify fine-grained configuration could be useful for many
reasons.  Let's take the example of a node with 32 cores.  First of all, as
per my understanding, having 5 executors with 6 cores each will almost
always perform better than having a single executor with 30 cores.  Also,
these 5 executors could be a) used by the same application, or b) shared
amongst multiple applications.  In the case of a single executor with 30
cores, some of the slots/cores could be wasted if there are fewer tasks
(from a single application) to be executed than there are cores.
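The core arithmetic behind this can be sketched with a small illustrative
model (hypothetical code, not part of Spark or Mesos): with, say, 10
concurrent one-core tasks, the 1 x 30 layout strands 20 idle cores inside
the one container the application is holding, while the 5 x 6 layout needs
only 2 executors and leaves 3 whole executors (18 cores) free to be
offered to other applications.

```python
import math

def layout_waste(num_executors, cores_per_executor, tasks):
    """Hypothetical model, assuming one core per task: return
    (cores idle inside executors the application must hold,
     cores in whole executors left free for other applications)."""
    needed = min(num_executors, math.ceil(tasks / cores_per_executor))
    held = needed * cores_per_executor
    idle_in_held = held - min(tasks, held)
    free_for_others = (num_executors - needed) * cores_per_executor
    return idle_in_held, free_for_others

# 10 concurrent tasks on the two layouts from the example above:
print(layout_waste(1, 30, 10))  # single 30-core executor -> (20, 0)
print(layout_waste(5, 6, 10))   # five 6-core executors   -> (2, 18)
```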

As I said, applications could specify a desirable number of executors.  If
that many are not available, Mesos (in a simple implementation) could
provide/offer whatever is available.  In a slightly more complex
implementation, we could build a simple protocol to negotiate.
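A rough sketch of what such a negotiation could look like (purely
hypothetical logic; none of these names are an existing Mesos or Spark
API):

```python
def negotiate(desired, minimum, offered):
    """Hypothetical one-round negotiation between an application and
    the resource manager: take the full ask if the offer covers it,
    accept a partial grant down to a stated minimum, else decline."""
    if offered >= desired:
        return desired           # full ask satisfied
    if offered >= minimum:
        return offered           # partial grant, still workable
    return 0                     # decline and wait for a better offer
```

The "simple implementation" above corresponds to `minimum == 0` (take
whatever is offered); a real protocol would also need timeouts and revised
offers.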

Regards,
Ajay

On Wed, Aug 12, 2015 at 5:51 PM, Tim Chen <tim@mesosphere.io> wrote:

> You're referring to both fine grain and coarse grain?
>
> Desirable number of executors per node could be interesting but it can't
> be guaranteed (or we could try to and when failed abort the job).
>
> How would you imagine this new option to actually work?
>
>
> Tim
>
> On Wed, Aug 12, 2015 at 11:48 AM, Ajay Singal <asingal11@gmail.com> wrote:
>
>> Hi Tim,
>>
>> An option like spark.mesos.executor.max to cap the number of executors
>> per node/application would be very useful.  However, an option like
>> spark.mesos.executor.num to specify the desired number of executors per
>> node would provide even better control.
>>
>> Thanks,
>> Ajay
>>
>> On Wed, Aug 12, 2015 at 4:18 AM, Tim Chen <tim@mesosphere.io> wrote:
>>
>>> Yes, the options are not that configurable yet, but I think it's not
>>> hard to change.
>>>
>>> I actually have a patch out specifically to make the number of CPUs per
>>> executor configurable in coarse-grained mode; hopefully it will be
>>> merged in the next release.
>>>
>>> I think the open question now is whether, for fine-grained mode, we can
>>> limit the maximum number of concurrent executors, and I think we can
>>> definitely add a new option like spark.mesos.executor.max to cap it.
>>>
>>> I'll file a JIRA and hopefully get this change in soon too.
>>>
>>> Tim
>>>
>>>
>>>
>>> On Tue, Aug 11, 2015 at 6:21 AM, Haripriya Ayyalasomayajula <
>>> aharipriya92@gmail.com> wrote:
>>>
>>>> Spark evolved as an example framework for Mesos - that's how I know
>>>> it. It is surprising to see that the options provided by Mesos in this
>>>> case are fewer. I haven't tried tweaking the source code yet, but I
>>>> would love to see what options could be there!
>>>>
>>>> On Tue, Aug 11, 2015 at 5:42 AM, Jerry Lam <chilinglam@gmail.com>
>>>> wrote:
>>>>
>>>>> My experience with Mesos + Spark is not great. I saw one executor with
>>>>> 30 CPUs and another executor with 6. So I don't think you can easily
>>>>> configure it without some tweaking of the source code.
>>>>>
>>>>> Sent from my iPad
>>>>>
>>>>> On 2015-08-11, at 2:38, Haripriya Ayyalasomayajula <
>>>>> aharipriya92@gmail.com> wrote:
>>>>>
>>>>> Hi Tim,
>>>>>
>>>>> Spark on YARN allows us to do it using the --num-executors and
>>>>> --executor-cores command-line arguments. I just got a chance to look
>>>>> at a similar spark user list mail, but no answer yet. So does Mesos
>>>>> allow setting the number of executors and cores? Is there a default
>>>>> number it assumes?
>>>>>
>>>>> On Mon, Jan 5, 2015 at 5:07 PM, Tim Chen <tim@mesosphere.io> wrote:
>>>>>
>>>>>> Forgot to hit reply-all.
>>>>>>
>>>>>> ---------- Forwarded message ----------
>>>>>> From: Tim Chen <tim@mesosphere.io>
>>>>>> Date: Sun, Jan 4, 2015 at 10:46 PM
>>>>>> Subject: Re: Controlling number of executors on Mesos vs YARN
>>>>>> To: mvle <mvle@us.ibm.com>
>>>>>>
>>>>>>
>>>>>> Hi Mike,
>>>>>>
>>>>>> You're correct, there is no such setting for Mesos coarse-grained
>>>>>> mode, since the assumption is that each node is launched with one
>>>>>> container and Spark launches multiple tasks in that container.
>>>>>>
>>>>>> In fine-grained mode there isn't a setting like that either, as it
>>>>>> currently will launch an executor as long as the offer satisfies the
>>>>>> minimum container resource requirement.
>>>>>>
>>>>>> I created a JIRA earlier about capping the number of executors, or
>>>>>> better distributing the number of executors launched on each node.
>>>>>> Since the decision of which node to launch containers on is made
>>>>>> entirely on the Spark scheduler side, it's very easy to modify.
>>>>>>
>>>>>> Btw, what's the configuration to set the # of executors on YARN side?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Tim
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Sun, Jan 4, 2015 at 9:37 PM, mvle <mvle@us.ibm.com> wrote:
>>>>>>
>>>>>>> I'm trying to compare the performance of Spark running on Mesos vs.
>>>>>>> YARN. However, I am having problems being able to configure the
>>>>>>> Spark workload to run in a similar way on Mesos and YARN.
>>>>>>>
>>>>>>> When running Spark on YARN, you can specify the number of executors
>>>>>>> per
>>>>>>> node. So if I have a node with 4 CPUs, I can specify 6 executors
on
>>>>>>> that
>>>>>>> node. When running Spark on Mesos, there doesn't seem to be an
>>>>>>> equivalent
>>>>>>> way to specify this. In Mesos, you can somewhat force this by
>>>>>>> specifying the
>>>>>>> number of CPU resources to be 6 when running the slave daemon.
>>>>>>> However, this
>>>>>>> seems to be a static configuration of the Mesos cluster rather
>>>>>>> something
>>>>>>> that can be configured in the Spark framework.
>>>>>>>
>>>>>>> So here is my question:
>>>>>>>
>>>>>>> For Spark on Mesos, am I correct that there is no way to control the
>>>>>>> number of executors per node (assuming an idle cluster)? For Spark
>>>>>>> on Mesos coarse-grained mode, there is a way to specify max_cores,
>>>>>>> but that is still not equivalent to specifying the number of
>>>>>>> executors per node as when Spark is run on YARN.
>>>>>>>
>>>>>>> If I am correct, then it seems Spark might be at a disadvantage
>>>>>>> running on Mesos compared to YARN (since it lacks the fine-tuning
>>>>>>> ability provided by YARN).
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Mike
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> View this message in context:
>>>>>>> http://apache-spark-user-list.1001560.n3.nabble.com/Controlling-number-of-executors-on-Mesos-vs-YARN-tp20966.html
>>>>>>> Sent from the Apache Spark User List mailing list archive at
>>>>>>> Nabble.com.
>>>>>>>
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
>>>>>>> For additional commands, e-mail: user-help@spark.apache.org
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Regards,
>>>>> Haripriya Ayyalasomayajula
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Regards,
>>>> Haripriya Ayyalasomayajula
>>>>
>>>>
>>>
>>
>
