I am facing the same problem: the number of partitions was huge (960 cores, 16,000 partitions). I tried to reduce the number of partitions with coalesce, but my data is unbalanced. After adding coalesce I get a Java heap space (out-of-memory) error; there was no such error without coalesce. My guess is that the error comes from the uneven data, with some heavy partitions being merged together.
spark.conf.set("spark.sql.shuffle.partitions", "5")
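
To illustrate the guess about heavy partitions getting merged, here is a toy model in plain Python. It is not Spark's actual coalescer: it only assumes that coalesce(n) combines whole existing partitions without splitting them, while repartition(n) shuffles rows evenly across n partitions. The partition sizes and both function names are made up for the example.

```python
def coalesce_sizes(sizes, n):
    """Toy model of df.coalesce(n): whole partitions are merged, never split.

    Contiguous grouping is an assumption made for illustration only.
    """
    merged = [0] * n
    m = len(sizes)
    for i, rows in enumerate(sizes):
        merged[i * n // m] += rows
    return merged


def repartition_sizes(sizes, n):
    """Toy model of df.repartition(n): rows are spread evenly by a full shuffle."""
    base, extra = divmod(sum(sizes), n)
    return [base + (1 if i < extra else 0) for i in range(n)]


# Hypothetical skewed input: 16 partitions, the first four are heavy.
sizes = [1000, 1000, 1000, 1000] + [10] * 12

coalesced = coalesce_sizes(sizes, 4)
repartitioned = repartition_sizes(sizes, 4)

print("coalesce   :", coalesced, "largest =", max(coalesced))
print("repartition:", repartitioned, "largest =", max(repartitioned))
```

In this model the largest coalesced partition holds almost all of the rows, which is the kind of partition that could exhaust an executor's heap. If that is what happens in the real job, repartition(n) may avoid the error, at the cost of a full shuffle.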

I noticed that in my Spark application, the number of tasks in the first stage is equal to the number of files read by the application (at least for Avro) when the number of CPU cores is less than the number of files. When there are more CPU cores than files, it is usually equal to the default parallelism. Why does it behave like this? Would this require a lot of resources from the driver? Is there any way to decrease the number of tasks (partitions) in the first stage without merging the files before loading?
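
One possible explanation, assuming the files are read through the DataFrame file-source path (I have not verified which Avro reader is in use here): Spark derives a target split size from spark.sql.files.maxPartitionBytes (default 128 MB), spark.sql.files.openCostInBytes (default 4 MB) and the default parallelism, then packs file chunks into partitions of at most that size. The plain-Python sketch below is a simplified model of that planning step, not Spark's code; the function names and the 20 x 100 MB input are hypothetical.

```python
MB = 1024 * 1024


def max_split_bytes(file_sizes, default_parallelism,
                    max_partition_bytes=128 * MB, open_cost=4 * MB):
    """Target split size: capped by maxPartitionBytes, lowered when there are many cores."""
    total = sum(size + open_cost for size in file_sizes)
    bytes_per_core = total // default_parallelism
    return min(max_partition_bytes, max(open_cost, bytes_per_core))


def plan_partitions(file_sizes, default_parallelism,
                    max_partition_bytes=128 * MB, open_cost=4 * MB):
    """Return the number of read partitions (first-stage tasks) in this model."""
    split = max_split_bytes(file_sizes, default_parallelism,
                            max_partition_bytes, open_cost)
    # Cut every (splittable) file into chunks of at most `split` bytes.
    chunks = []
    for size in file_sizes:
        offset = 0
        while offset < size:
            chunks.append(min(split, size - offset))
            offset += split
    # Pack chunks, largest first, closing a partition when the next chunk would overflow it.
    chunks.sort(reverse=True)
    partitions, current, non_empty = 0, 0, False
    for chunk in chunks:
        if non_empty and current + chunk > split:
            partitions += 1
            current, non_empty = 0, False
        current += chunk + open_cost
        non_empty = True
    return partitions + (1 if non_empty else 0)


files = [100 * MB] * 20  # hypothetical input: 20 files of 100 MB each

fewer_cores = plan_partitions(files, default_parallelism=8)
more_cores = plan_partitions(files, default_parallelism=40)
bigger_limit = plan_partitions(files, default_parallelism=8,
                               max_partition_bytes=512 * MB)

print(fewer_cores, more_cores, bigger_limit)
```

In this model, 8 cores give one task per file, 40 cores give one task per core, and raising the partition byte limit lets two files share a task. If the real job behaves the same way, something like spark.conf.set("spark.sql.files.maxPartitionBytes", str(512 * 1024 * 1024)) before the read may reduce the first-stage task count without merging files beforehand.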


