spark-user mailing list archives

From ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com>
Subject How to increase the number of tasks
Date Fri, 05 Jun 2015 09:48:14 GMT
I have a stage that spawns 174 tasks when I run repartition on Avro data.
Individual tasks read 512/317/316/214/173 MB of data. Even if I increase
the number of executors or the number of partitions (when calling
repartition), the number of tasks launched remains fixed at 174.
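
One possible reading, assuming the 174-task stage is the one that reads
the Avro files: a read stage gets one task per input split, so its task
count follows from the file sizes and the split size, and the partition
count passed to repartition only affects the stage after the shuffle.
The Python sketch below is a rough model of that idea, not Spark's or
Hadoop's actual split logic. The function name estimate_read_tasks, the
five-file input and the 128 MB split size are hypothetical, chosen only
for illustration.

```python
import math


def estimate_read_tasks(file_sizes_mb, split_size_mb):
    """Rough model: one read task per input split; a split never spans files.

    Simplifying assumption: each file is cut into ceil(size / split_size)
    pieces. Real input formats apply extra rules (for example a slop factor),
    so treat the result as an estimate only.
    """
    return sum(max(1, math.ceil(size / split_size_mb)) for size in file_sizes_mb)


# Hypothetical input: five files whose sizes match the per-task reads above.
sizes_mb = [512, 317, 316, 214, 173]

# With a split size of 512 MB, every file fits in one split: 5 read tasks.
print(estimate_read_tasks(sizes_mb, split_size_mb=512))  # 5

# With a smaller (hypothetical) split size of 128 MB: 4 + 3 + 3 + 2 + 2 tasks.
print(estimate_read_tasks(sizes_mb, split_size_mb=128))  # 14
```

In this model the partition count given to repartition is not an input
at all, which would be consistent with the task count staying at 174
however many partitions or executors are requested.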

1) I want to speed up this stage. How do I do it?
2) A few tasks finish in 20 minutes, a few in 15 and a few in less than
10. Why does this happen?
Since this is a repartition stage, it should not depend on the nature of
the data.

It's taking more than 30 minutes and I want to speed it up by throwing
more executors at it.

Any suggestions would be appreciated.

Deepak
