spark-dev mailing list archives

From: Daimon
Subject: Re: Spark FAIR Scheduler vs FIFO Scheduler
Date: Tue, 19 Jun 2018 14:16:30 GMT

Spark schedules every job you submit and adds its tasks to a common task
queue. The difference between FIFO and FAIR is how that queue is handled:
FIFO prefers to run jobs in the order they were submitted, while FAIR tries
to divide resources equally among all running jobs.
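
For reference, the mode is selected with the `spark.scheduler.mode` setting,
which defaults to FIFO. Below is a minimal PySpark sketch of switching it.
The helper names `scheduler_conf` and `build_session` are made up for
illustration, and the sketch assumes the setting is applied before the
session (and its SparkContext) is first created.

```python
VALID_MODES = ("FIFO", "FAIR")


def scheduler_conf(mode="FIFO"):
    """Return the Spark config entry that selects the scheduling mode.

    Hypothetical helper: it only builds the key/value pair and does not
    touch Spark itself.
    """
    mode = mode.upper()
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    return {"spark.scheduler.mode": mode}


def build_session(app_name, mode="FAIR"):
    """Create (or reuse) a SparkSession with the requested scheduler mode.

    Assumption: no SparkContext exists yet, so the setting can take effect.
    """
    # Imported lazily so scheduler_conf() is usable without PySpark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in scheduler_conf(mode).items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

Something like `spark = build_session("concurrent-writes", mode="FAIR")` would
then be called once at the start of the driver program.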

The problem you have is a different one. The driver (actually the Spark API)
blocks on certain actions, such as writes. The driver does not continue with
the rest of your program until the write call returns, so it never gets the
opportunity to schedule additional jobs.

To run multiple writes concurrently on Spark, you need to call the Spark API
from multiple threads (using a thread pool or futures).

This way several writes are in progress on the driver at the same time, and
all of their jobs/tasks are added to the common queue.
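
A rough sketch of that threading approach in Python, using the standard
`concurrent.futures` thread pool. The names `run_concurrently` and
`write_all`, the Parquet output format, and the overwrite mode are
assumptions made for the example; the DataFrames and paths are whatever your
job already has.

```python
from concurrent.futures import ThreadPoolExecutor


def run_concurrently(actions, max_workers=4):
    """Run zero-argument callables in driver-side threads.

    Results come back in the same order as `actions`. If any action raised,
    result() re-raises that exception here.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(action) for action in actions]
        return [future.result() for future in futures]


def write_all(frames_and_paths, max_workers=4):
    """Start one blocking write per (DataFrame, output_path) pair.

    Each write still blocks, but only its own thread, so Spark sees the
    jobs of all writes at once instead of one after another.
    """
    actions = [
        # Default arguments bind df/path per iteration (avoids late binding).
        (lambda df=df, path=path: df.write.mode("overwrite").parquet(path))
        for df, path in frames_and_paths
    ]
    return run_concurrently(actions, max_workers=max_workers)
```

Usage would look like `write_all([(df_a, "/out/a"), (df_b, "/out/b")])`, with
hypothetical DataFrames and paths. `max_workers` caps how many writes are
submitted to Spark at the same time.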

At this point you can decide whether you want FIFO or FAIR. In some cases
(because of data locality) the FAIR scheduler can produce quicker overall
