spark-user mailing list archives

From Dillon Dukek <dillon.du...@placed.com.INVALID>
Subject Re: Spark on YARN not utilizing all the YARN containers available
Date Tue, 09 Oct 2018 20:04:34 GMT
I'm still not sure exactly what you mean when you say that you have 6
YARN containers. YARN should just be aware of the total resources available
in your cluster and then launch containers based on the executor
requirements you set when you submit your job. If you can, I think it would
be helpful to send me the command you're using to launch your Spark
process. You should also be able to use the logs and/or the Spark UI to
determine how many executors are running.
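
To make "executor requirements you set when you submit your job" concrete,
here is a rough sketch, not a definitive recipe. It assumes Spark 2.3 on
YARN with static allocation, where each executor runs in its own container
and the application master takes one more. The 10% overhead factor and
384 MB floor are the documented defaults as far as I know, but YARN rounds
requests up to its own allocation increment, so treat the numbers as
estimates. The helper names and `streaming_job.py` are made up for
illustration.

```python
# Assumed Spark 2.3 defaults; verify against your own cluster's configuration.
OVERHEAD_FACTOR = 0.10   # default fraction behind spark.executor.memoryOverhead
MIN_OVERHEAD_MB = 384    # default floor for that overhead


def executor_container_mb(executor_memory_mb):
    """Memory requested from YARN per executor container, before YARN's rounding."""
    overhead = max(MIN_OVERHEAD_MB, int(executor_memory_mb * OVERHEAD_FACTOR))
    return executor_memory_mb + overhead


def total_containers(num_executors):
    """One container per executor plus one for the application master."""
    return num_executors + 1


def submit_args(num_executors, executor_cores, executor_memory_mb, app):
    """Build a spark-submit argument list that requests executors explicitly."""
    return [
        "spark-submit",
        "--master", "yarn",
        "--num-executors", str(num_executors),
        "--executor-cores", str(executor_cores),
        "--executor-memory", f"{executor_memory_mb}m",
        app,  # hypothetical application file
    ]


if __name__ == "__main__":
    # 6 executors with 4 cores each, as described in the thread; 1024 MB is
    # assumed here because spark.executor.memory defaults to 1g.
    print(" ".join(submit_args(6, 4, 1024, "streaming_job.py")))
    print("containers expected:", total_containers(6))
    print("MB requested per executor container:", executor_container_mb(1024))
```

If `--num-executors` (or `spark.executor.instances`) is not set and dynamic
allocation is off, Spark on YARN should default to 2 executors, so it is
worth comparing what you asked for against the Executors tab in the Spark UI.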

On Tue, Oct 9, 2018 at 12:57 PM Gourav Sengupta <gourav.sengupta@gmail.com>
wrote:

> Hi,
>
> Maybe I am not quite clear in my head on this one, but how do we know
> that 1 YARN container = 1 executor?
>
> Regards,
> Gourav Sengupta
>
> On Tue, Oct 9, 2018 at 8:53 PM Dillon Dukek
> <dillon.dukek@placed.com.invalid> wrote:
>
>> Can you send how you are launching your streaming process? Also, what
>> environment is this cluster running in (EMR, GCP, self-managed, etc.)?
>>
>> On Tue, Oct 9, 2018 at 10:21 AM kant kodali <kanth909@gmail.com> wrote:
>>
>>> Hi All,
>>>
>>> I am using Spark 2.3.1 with YARN as the cluster manager.
>>>
>>> I currently have:
>>>
>>> 1) 6 YARN containers (executors=6) with 4 executor cores for each
>>> container.
>>> 2) 6 Kafka partitions from one topic.
>>> 3) You can assume every other configuration is set to whatever the
>>> default values are.
>>>
>>> I spawned a simple streaming query and I see all the tasks get scheduled
>>> on one YARN container. Am I missing any config?
>>>
>>> Thanks!
>>>
>>
