spark-user mailing list archives

From Sean Owen <so...@cloudera.com>
Subject Re: Spark on YARN
Date Wed, 19 Nov 2014 22:14:04 GMT
I think your config may be the issue then. It sounds like 1 server is
configured in a different YARN group that reports way less resource
than it actually has.
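
If so, comparing the NodeManager capacities across hosts (or across the
Cloudera Manager role groups) should show the mismatch. A sketch of the
relevant yarn-site.xml entries, with illustrative values for 32-core,
64 GB hosts (not a confirmed fix for your cluster):

  <!-- yarn-site.xml, per NodeManager; values below are illustrative -->
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>57344</value> <!-- memory offered to YARN containers -->
  </property>
  <property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>32</value> <!-- vcores offered to YARN containers -->
  </property>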

On Wed, Nov 19, 2014 at 5:27 PM, Alan Prando <alan@scanboo.com.br> wrote:
> Hi all!
>
> Thanks for answering!
>
> @Sean, I tried running with 30 executor-cores, and 1 machine is still
> not processing.
> @Vanzin, I checked the RM's web UI, and all nodes were detected and
> "RUNNING". The interesting fact is that the available memory and cores
> of 1 node were different from the other 2: just 1 available core and
> 1 GB of available RAM.
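>
> (For reference, the same per-node capacities can be checked from the
> CLI; <node-id> below is a placeholder for the odd node's host:port:
>
>   yarn node -list -all
>   yarn node -status <node-id>
>
> The -status report includes the node's memory and CPU capacity.)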
>
> @All, I created a new cluster with 10 slaves and 1 master, and now 9 of
> my slaves are working and 1 is still not processing.
>
> It's fine by me! I'm just wondering why YARN's doing it... Does anyone know
> the answer?
>
> 2014-11-18 16:18 GMT-02:00 Sean Owen <sowen@cloudera.com>:
>
>> My guess is you're asking for all cores of all machines, but the driver
>> needs at least one core, so one executor is unable to find a machine to
>> fit on.
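>>
>> For illustration (the numbers are illustrative, not a confirmed fix),
>> leaving a couple of cores per host free for the driver/AM would look
>> like:
>>
>>   ./spark-submit --master yarn --num-executors 3 --executor-cores 30 \
>>     --executor-memory 32g feature_extractor.py -r 390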
>>
>> On Nov 18, 2014 7:04 PM, "Alan Prando" <alan@scanboo.com.br> wrote:
>>>
>>> Hi Folks!
>>>
>>> I'm running Spark on YARN cluster installed with Cloudera Manager
>>> Express.
>>> The cluster has 1 master and 3 slaves, each machine with 32 cores and 64G
>>> RAM.
>>>
>>> My Spark job is working fine; however, it seems that just 2 of the 3
>>> slaves are working (htop shows 2 slaves at 100% on all 32 cores, and
>>> 1 slave without any processing).
>>>
>>> I'm using this command:
>>> ./spark-submit --master yarn --num-executors 3 --executor-cores 32
>>> --executor-memory 32g feature_extractor.py -r 390
>>>
>>> Additionally, Spark's log shows communication with only 2 slaves:
>>> 14/11/18 17:19:38 INFO YarnClientSchedulerBackend: Registered executor:
>>> Actor[akka.tcp://sparkExecutor@ip-172-31-13-180.ec2.internal:33177/user/Executor#-113177469]
>>> with ID 1
>>> 14/11/18 17:19:38 INFO RackResolver: Resolved
>>> ip-172-31-13-180.ec2.internal to /default
>>> 14/11/18 17:19:38 INFO YarnClientSchedulerBackend: Registered executor:
>>> Actor[akka.tcp://sparkExecutor@ip-172-31-13-179.ec2.internal:51859/user/Executor#-323896724]
>>> with ID 2
>>> 14/11/18 17:19:38 INFO RackResolver: Resolved
>>> ip-172-31-13-179.ec2.internal to /default
>>> 14/11/18 17:19:38 INFO BlockManagerMasterActor: Registering block manager
>>> ip-172-31-13-180.ec2.internal:50959 with 16.6 GB RAM
>>> 14/11/18 17:19:39 INFO BlockManagerMasterActor: Registering block manager
>>> ip-172-31-13-179.ec2.internal:53557 with 16.6 GB RAM
>>> 14/11/18 17:19:51 INFO YarnClientSchedulerBackend: SchedulerBackend is
>>> ready for scheduling beginning after waiting
>>> maxRegisteredResourcesWaitingTime: 30000(ms)
>>>
>>> Is there a configuration to call spark's job on YARN cluster with all
>>> slaves?
>>>
>>> Thanks in advance! =]
>>>
>>> ---
>>> Regards
>>> Alan Vidotti Prando.
>>>
>>>
>
