spark-user mailing list archives

From "Mail.com" <pradeep.mi...@mail.com>
Subject Num of executors and cores
Date Tue, 26 Jul 2016 00:18:49 GMT
Hi All,

I have a directory containing 12 files. I want to read each file in its entirety, so I am
reading the directory with wholeTextFiles(dirpath, numPartitions).

I run spark-submit as <all other stuff> --num-executors 12 --executor-cores 1, with
numPartitions set to 12.

However, when I run the job, the stage that reads the directory has only 8 tasks, so some
tasks read more than one file and take twice as long.
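
A minimal sketch of the read described above, assuming PySpark. The function name and
its arguments are illustrative and not taken from the actual job. `sc` is assumed to be
an existing SparkContext. Note that the second argument of wholeTextFiles is
documented as minPartitions, a suggested minimum, so the count reported here may
differ from the value passed in.

```python
def read_partition_count(sc, dirpath, num_partitions):
    """Return how many partitions the wholeTextFiles read produces.

    Hypothetical helper for illustration: `sc` is an existing SparkContext,
    `dirpath` is the input directory, and `num_partitions` is the value
    passed as numPartitions in the post.
    """
    # Each record is a (file path, full file content) pair.
    # minPartitions is a suggested minimum, not an exact partition count.
    rdd = sc.wholeTextFiles(dirpath, minPartitions=num_partitions)

    # The stage that reads the directory runs one task per partition.
    return rdd.getNumPartitions()
```

Calling read_partition_count(sc, dirpath, 12) in the driver would show the number of
read tasks before any action runs.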

What can I do so that the files are read by 12 tasks, i.e. one file per task?

Thanks,
Pradeep
