spark-user mailing list archives

From ji yan <yanj...@gmail.com>
Subject Re: Dynamic resource allocation to Spark on Mesos
Date Fri, 03 Feb 2017 00:39:57 GMT
got it, thanks for clarifying!

On Thu, Feb 2, 2017 at 2:57 PM, Michael Gummelt <mgummelt@mesosphere.io>
wrote:

> Yes, that's expected.  spark.executor.cores sizes a single executor.  It
> doesn't limit the number of executors.  For that, you need spark.cores.max
> (--total-executor-cores).
>
> And rdd.parallelize does not specify the number of executors.  It
> specifies the number of partitions, which relates to the number of tasks,
> not executors.  Unless you're running with dynamic allocation enabled, the
> number of executors for your job is static, and determined at start time.
> It's not influenced by your job itself.
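[Editor's note: the distinction above can be made concrete with a small back-of-the-envelope sketch. This is plain Python with made-up numbers; the helper functions are illustrations, not part of any Spark API.]

```python
# Hypothetical model of Spark-on-Mesos static sizing. Not Spark code.

def num_executors(cores_max, executor_cores):
    """spark.cores.max caps total cores across the job, while
    spark.executor.cores sizes each executor, so the executor count
    is (at most) their quotient."""
    return cores_max // executor_cores

def num_tasks(num_partitions):
    """rdd.parallelize(data, n) creates n partitions; each partition
    becomes one task, regardless of how many executors exist."""
    return num_partitions

# spark.cores.max=8, spark.executor.cores=2 -> at most 4 executors
print(num_executors(8, 2))   # 4
# parallelize(collection, 100) -> 100 tasks queued over those executors
print(num_tasks(100))        # 100
```

The 100 tasks above would simply be scheduled in waves over the 4 executors; the partition count never changes how many executors exist.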
>
>
> On Thu, Feb 2, 2017 at 2:42 PM, Ji Yan <jiyan@drive.ai> wrote:
>
>> I tried setting spark.executor.cores per executor, but Spark seems to be
>> spinning up as many executors as possible, up to spark.cores.max or however
>> many CPU cores are available on the cluster, and this may be undesirable
>> because the number of executors in rdd.parallelize(collection, # of
>> partitions) is being overridden
>>
>> On Thu, Feb 2, 2017 at 1:30 PM, Michael Gummelt <mgummelt@mesosphere.io>
>> wrote:
>>
>>> As of Spark 2.0, Mesos mode does support setting cores at the executor
>>> level, but you might need to set the property directly (--conf
>>> spark.executor.cores=<cores>).  I've written about this here:
>>> https://docs.mesosphere.com/1.8/usage/service-guides/spark/job-scheduling/.
>>> That doc is for DC/OS, but the configuration is the same.
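[Editor's note: for reference, such an invocation could be assembled as below. All values and the master URL are placeholders, shown here as plain Python so the flags stay in one place.]

```python
# Sketch of a spark-submit command line that sets per-executor cores
# directly via --conf, plus the job-wide core cap. Placeholder values.
executor_cores = 4        # cores for each executor
total_cores = 16          # spark.cores.max: cap across all executors

args = [
    "spark-submit",
    "--master", "mesos://zk://master:2181/mesos",   # placeholder master URL
    "--conf", f"spark.executor.cores={executor_cores}",
    "--total-executor-cores", str(total_cores),
    "my_app.py",                                    # hypothetical app
]
print(" ".join(args))
```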
>>>
>>> On Thu, Feb 2, 2017 at 1:06 PM, Ji Yan <jiyan@drive.ai> wrote:
>>>
>>>> I was mainly confused about why this is the case with memory, but with
>>>> CPU cores it is not specified at the per-executor level
>>>>
>>>> On Thu, Feb 2, 2017 at 1:02 PM, Michael Gummelt <mgummelt@mesosphere.io>
>>>> wrote:
>>>>
>>>>> It sounds like you've answered your own question, right?
>>>>> --executor-memory means the memory per executor.  If you have no executor
>>>>> w/ 200GB memory, then the driver will accept no offers.
>>>>>
>>>>> On Thu, Feb 2, 2017 at 1:01 PM, Ji Yan <jiyan@drive.ai> wrote:
>>>>>
>>>>>> sorry, to clarify, i was using --executor-memory for memory,
>>>>>> and --total-executor-cores for cpu cores
>>>>>>
>>>>>> On Thu, Feb 2, 2017 at 12:56 PM, Michael Gummelt <mgummelt@mesosphere.io>
>>>>>> wrote:
>>>>>>
>>>>>>> What CLI args are you referring to?  I'm aware of spark-submit's
>>>>>>> arguments (--executor-memory, --total-executor-cores, and --executor-cores)
>>>>>>>
>>>>>>> On Thu, Feb 2, 2017 at 12:41 PM, Ji Yan <jiyan@drive.ai> wrote:
>>>>>>>
>>>>>>>> I have done an experiment on this today. It shows that only CPUs are
>>>>>>>> tolerant of insufficient cluster size when a job starts. On my
>>>>>>>> cluster, I have 180 GB of memory and 64 cores. When I run spark-submit
>>>>>>>> (on mesos) with --cpu_cores set to 1000, the job starts up with 64
>>>>>>>> cores, but when I set --memory to 200 GB, the job fails to start with
>>>>>>>> "Initial job has not accepted any resources; check your cluster UI to
>>>>>>>> ensure that workers are registered and have sufficient resources"
>>>>>>>>
>>>>>>>> Also it is confusing to me that --cpu_cores specifies the number of
>>>>>>>> CPU cores across all executors, but --memory specifies the per-executor
>>>>>>>> memory requirement.
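[Editor's note: the asymmetry observed above can be modeled roughly as follows. This is an illustrative sketch, not Spark's actual scheduler code: total cores are additive across agents and simply capped, while per-executor memory must fit inside a single agent's offer.]

```python
# Rough model of why a huge total-core request degrades gracefully
# while a huge per-executor memory request fails outright.

def cores_granted(requested_total, offers):
    """Cores are summed across agents: Spark takes what it can,
    up to the requested total."""
    return min(requested_total, sum(o["cores"] for o in offers))

def executor_fits(executor_mem_gb, offers):
    """Per-executor memory must fit within a single agent's offer;
    it cannot be stitched together across agents."""
    return any(o["mem_gb"] >= executor_mem_gb for o in offers)

# A cluster like the one described: 64 cores, 180 GB over four agents.
offers = [{"cores": 16, "mem_gb": 45} for _ in range(4)]

print(cores_granted(1000, offers))   # 64: capped at what the cluster has
print(executor_fits(200, offers))    # False: no single agent offers 200 GB
```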
>>>>>>>>
>>>>>>>> On Mon, Jan 30, 2017 at 11:34 AM, Michael Gummelt <mgummelt@mesosphere.io>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Jan 30, 2017 at 9:47 AM, Ji Yan <jiyan@drive.ai> wrote:
>>>>>>>>>
>>>>>>>>>> Tasks begin scheduling as soon as the first executor comes up
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks all for the clarification. Is this the default behavior of
>>>>>>>>>> Spark on Mesos today? I think this is what we are looking for,
>>>>>>>>>> because sometimes a job can take up lots of resources and later
>>>>>>>>>> jobs cannot get all the resources they ask for. If a Spark job
>>>>>>>>>> starts with only a subset of the resources it asks for, does it
>>>>>>>>>> know to expand its resources later when more resources become
>>>>>>>>>> available?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Yes.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Launch each executor with at least 1GB RAM, but if mesos offers
>>>>>>>>>>> 2GB at some moment, then launch an executor with 2GB RAM
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> This is less useful in our use case, but I am also quite
>>>>>>>>>> interested in cases in which this could be helpful. I think this
>>>>>>>>>> would also help with overall resource utilization on the cluster:
>>>>>>>>>> when another job that has a hard requirement on resources starts
>>>>>>>>>> up, the extra resources given to the first job could be flexibly
>>>>>>>>>> re-allocated to the second job.
>>>>>>>>>>
>>>>>>>>>> On Sat, Jan 28, 2017 at 2:32 PM, Michael Gummelt <mgummelt@mesosphere.io>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> We've talked about that, but it hasn't become a priority because
>>>>>>>>>>> we haven't had a driving use case.  If anyone has a good argument
>>>>>>>>>>> for "variable" resource allocation like this, please let me know.
>>>>>>>>>>>
>>>>>>>>>>> On Sat, Jan 28, 2017 at 9:17 AM, Shuai Lin <linshuai2012@gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>>> An alternative behavior is to launch the job with the best
>>>>>>>>>>>>> resource offer Mesos is able to give
>>>>>>>>>>>>
>>>>>>>>>>>> Michael has just made an excellent explanation about dynamic
>>>>>>>>>>>> allocation support in mesos. But IIUC, what you want to achieve
>>>>>>>>>>>> is something like (using RAM as an example): "Launch each
>>>>>>>>>>>> executor with at least 1GB RAM, but if mesos offers 2GB at some
>>>>>>>>>>>> moment, then launch an executor with 2GB RAM".
>>>>>>>>>>>>
>>>>>>>>>>>> I wonder what's the benefit of that? To reduce the "resource
>>>>>>>>>>>> fragmentation"?
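[Editor's note: the "variable" sizing policy under discussion can be pictured with a short sketch. This is purely hypothetical; as the thread notes, no Spark cluster manager implements it.]

```python
# Sketch of the proposed policy: launch an executor whenever an offer
# meets a minimum, but size it up to whatever the offer contains.

def executor_size(offer_mem_gb, min_mem_gb=1, max_mem_gb=None):
    """Return the memory to allocate from this offer, or None to decline."""
    if offer_mem_gb < min_mem_gb:
        return None                      # offer too small: decline
    if max_mem_gb is not None:
        return min(offer_mem_gb, max_mem_gb)
    return offer_mem_gb                  # take the whole offer

print(executor_size(0.5))                # None: below the 1 GB floor
print(executor_size(2))                  # 2: a 2 GB offer -> 2 GB executor
print(executor_size(8, max_mem_gb=4))    # 4: capped by the ceiling
```

This only makes sense with Mesos-style offers, which is why the thread notes the idea doesn't carry over to the other cluster managers.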
>>>>>>>>>>>>
>>>>>>>>>>>> Anyway, that is not supported at this moment. In all the
>>>>>>>>>>>> supported cluster managers of spark (mesos, yarn, standalone, and
>>>>>>>>>>>> the upcoming spark on kubernetes), you have to specify the cores
>>>>>>>>>>>> and memory of each executor.
>>>>>>>>>>>>
>>>>>>>>>>>> It may not be supported in the future either, because only mesos
>>>>>>>>>>>> has the concept of offers, a result of its two-level scheduling
>>>>>>>>>>>> model.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Sat, Jan 28, 2017 at 1:35 AM, Ji Yan <jiyan@drive.ai> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Dear Spark Users,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Is there currently a way to dynamically allocate resources to
>>>>>>>>>>>>> Spark on Mesos? Within Spark we can specify the CPU cores and
>>>>>>>>>>>>> memory before running a job. The way I understand it, the Spark
>>>>>>>>>>>>> job will not run if the CPU/memory requirement is not met. This
>>>>>>>>>>>>> may lead to a decrease in overall utilization of the cluster. An
>>>>>>>>>>>>> alternative behavior is to launch the job with the best resource
>>>>>>>>>>>>> offer Mesos is able to give. Is this possible with the current
>>>>>>>>>>>>> implementation?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>> Ji
>>>>>>>>>>>>>
>>>>>>>>>>>>> The information in this email is confidential and may be
>>>>>>>>>>>>> legally privileged. It is intended solely for the addressee.
>>>>>>>>>>>>> Access to this email by anyone else is unauthorized. If you are
>>>>>>>>>>>>> not the intended recipient, any disclosure, copying,
>>>>>>>>>>>>> distribution or any action taken or omitted to be taken in
>>>>>>>>>>>>> reliance on it, is prohibited and may be unlawful.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Michael Gummelt
>>>>>>>>>>> Software Engineer
>>>>>>>>>>> Mesosphere
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>
>
>
>



-- 
iPhone. iTypos. iApologize.
