spark-user mailing list archives

From Jörn Franke <jornfra...@gmail.com>
Subject Re: Spark tuning: increase number of active tasks
Date Sat, 31 Oct 2015 11:29:58 GMT
Maybe Hortonworks support can help you better with this.

Otherwise you may want to look at the YARN scheduler configuration and its preemption settings. Do you use
something like speculative execution?

How do you start the programs? Maybe you are already using all the cores of the master...
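
For reference, speculative execution and an explicit executor layout can be set directly from PySpark. Below is a minimal sketch against the Spark 1.x Python API; the app name and the memory value are placeholders, not recommendations for this cluster:

    from pyspark import SparkConf, SparkContext

    conf = (SparkConf()
            .setAppName("tuning-sketch")             # placeholder name
            .set("spark.speculation", "true")        # re-launch slow tasks speculatively
            .set("spark.executor.instances", "72")   # equivalent to --num-executors
            .set("spark.executor.cores", "2")        # concurrent tasks per executor
            .set("spark.executor.memory", "4g"))     # illustrative value only
    sc = SparkContext(conf=conf)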

> On 30 Oct 2015, at 23:32, YI, XIAOCHUAN <xy1267@att.com> wrote:
> 
> Hi
> Our team has a 40-node Hortonworks Hadoop 2.2.4.2-2 cluster (36 data nodes) with Apache
> Spark 1.2 and 1.4 installed.
> Each node has 64G RAM and 8 cores.
>  
> We are only able to use <= 72 executors with executor-cores=2
> So we only get 144 active tasks when running our PySpark programs.
> [Stage 1:===============>                                    (596 + 144) / 2042]
> If we use a larger number for --num-executors, the PySpark program exits with errors:
> ERROR YarnScheduler: Lost executor 113 on hag017.example.com: remote Rpc client disassociated
>  
> I tried Spark 1.4 and conf.set("dynamicAllocation.enabled", "true"). However, it does
> not help us to increase the number of active tasks.
> I expected a larger number of active tasks given the cluster we have.
> Could anyone advise on this? Thank you very much!
>  
> Shaun
>  
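
One note on the conf.set line quoted above: in Spark 1.4 the dynamic allocation properties carry a "spark." prefix and also require the YARN external shuffle service to be enabled. A minimal sketch with the standard key names follows; the min/max values are illustrative only, and the "Lost executor ... remote Rpc client disassociated" errors are often worth cross-checking against spark.yarn.executor.memoryOverhead as well:

    from pyspark import SparkConf, SparkContext

    conf = (SparkConf()
            .setAppName("dynamic-allocation-sketch")          # placeholder name
            .set("spark.dynamicAllocation.enabled", "true")   # note the "spark." prefix
            .set("spark.shuffle.service.enabled", "true")     # required for dynamic allocation on YARN
            .set("spark.dynamicAllocation.minExecutors", "36")
            .set("spark.dynamicAllocation.maxExecutors", "144")
            .set("spark.executor.cores", "2"))
    sc = SparkContext(conf=conf)

    # Rough capacity check for the cluster described above:
    # 36 data nodes x 8 cores = 288 cores; at 2 cores per executor that is
    # at most 144 executors (288 concurrent task slots) before YARN overhead.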
