spark-user mailing list archives

From "宋源栋" <yuandong.s...@greatopensource.com>
Subject Spark is only using one worker machine when more are available
Date Wed, 11 Apr 2018 09:10:05 GMT


Hi all,
I have a standalone-mode Spark cluster (no HDFS) of 10 machines, each with 40
CPU cores and 128 GB of RAM.
My application is a Spark SQL application that reads data from the database "tpch_100g" in MySQL
and runs TPC-H queries. When loading tables from MySQL into Spark, I split the biggest table, "lineitem",
into 600 partitions.
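
For reference, a minimal sketch of how such a partitioned JDBC load can be written (the host, credentials, and bounds below are illustrative, not my exact values):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("tpch-load")
  .getOrCreate()

// Partitioned JDBC read: Spark issues one query per partition, splitting
// the numeric column l_orderkey evenly between lowerBound and upperBound.
val lineitem = spark.read
  .format("jdbc")
  .option("url", "jdbc:mysql://mysql-host:3306/tpch_100g") // illustrative host
  .option("dbtable", "lineitem")
  .option("user", "user")                                  // illustrative credentials
  .option("password", "password")
  .option("partitionColumn", "l_orderkey")                 // numeric TPC-H key column
  .option("lowerBound", "1")
  .option("upperBound", "600000000")                       // illustrative bound for 100 GB scale
  .option("numPartitions", "600")                          // 600 partitions, as described above
  .load()

lineitem.createOrReplaceTempView("lineitem")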

When my application runs, there are only 40 executors (spark.executor.memory = 1g, spark.executor.cores
= 1) on the Executors page of the Spark application web UI, and all of them are on the same machine.
It is far too slow because all tasks run in parallel on only that one machine.
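
For completeness, a minimal sketch of the session setup with those two settings (the master URL is illustrative):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("tpch-sparksql")
  .master("spark://master-host:7077")      // illustrative standalone master URL
  .config("spark.executor.memory", "1g")   // 1 GB heap per executor
  .config("spark.executor.cores", "1")     // one core per executor
  .getOrCreate()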


