spark-user mailing list archives

From "YI, XIAOCHUAN" <xy1...@att.com>
Subject RE: Spark tuning increase number of active tasks
Date Fri, 30 Oct 2015 22:32:34 GMT
Hi,

Our team has a 40-node Hortonworks Hadoop cluster (HDP 2.2.4.2-2, 36 data nodes) with Apache Spark 1.2 and 1.4 installed. Each node has 64 GB of RAM and 8 cores.

We are only able to use <= 72 executors with --executor-cores=2, so we only get 144 active tasks when running our PySpark programs:
[Stage 1:===============>                                    (596 + 144) / 2042]
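For reference, this is roughly how we submit the job (the script name and executor memory below are placeholders, not our exact values):

    # 72 executors x 2 cores each = 144 concurrent tasks, matching the progress bar above
    spark-submit --master yarn-client \
        --num-executors 72 \
        --executor-cores 2 \
        --executor-memory 4g \
        our_job.py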
If we use a larger number for --num-executors, the PySpark program exits with errors:
ERROR YarnScheduler: Lost executor 113 on hag017.example.com: remote Rpc client disassociated

I also tried Spark 1.4 with conf.set("spark.dynamicAllocation.enabled", "true"), but it did not help us increase the number of active tasks.
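In case the details matter, this is roughly what that configuration looks like from PySpark (the app name and the min/max executor values are illustrative, not our exact settings):

    from pyspark import SparkConf, SparkContext

    conf = SparkConf().setAppName("our_job")
    # Dynamic allocation on YARN also requires the external shuffle service
    # to be running on the node managers.
    conf.set("spark.dynamicAllocation.enabled", "true")
    conf.set("spark.shuffle.service.enabled", "true")
    conf.set("spark.dynamicAllocation.minExecutors", "36")
    conf.set("spark.dynamicAllocation.maxExecutors", "144")
    sc = SparkContext(conf=conf)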
I would expect a larger number of active tasks given the cluster we have (36 data nodes x 8 cores = 288 cores in total).
Could anyone advise on this? Thank you very much!

Shaun

