spark-user mailing list archives

From "Livni, Dana" <>
Subject RE: major Spark performance problem
Date Sun, 09 Mar 2014 12:56:51 GMT
YARN also has this scheduling option.
The problem is that all of our applications have the same flow: the first stage is the heaviest
and the rest are very small.
When several requests (applications) start at the same time, their first stages are all
scheduled in parallel and, for some reason, they delay each other.
A stage that takes around 13s on its own can take up to 2m when running in parallel with
other identical stages (around 15 stages).
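
As a rough sanity check on these numbers, the sketch below compares the observed slowdown
with what ideal fair sharing would predict. It assumes, purely for illustration, that a single
first stage running alone already saturates the cluster; that assumption and the helper name
fair_share_completion_seconds are hypothetical and do not come from the measurements above.

```python
def fair_share_completion_seconds(solo_seconds: float, concurrent_stages: int) -> float:
    """Time for each of N identical stages to finish when they split a
    fully saturated cluster evenly (assumption: one stage alone uses
    every available core, so N stages each get 1/N of the capacity)."""
    if concurrent_stages < 1:
        raise ValueError("concurrent_stages must be at least 1")
    return solo_seconds * concurrent_stages


# Figures quoted in the message: ~13s alone, up to ~2m with ~15 stages in parallel.
solo_seconds = 13.0
concurrent_stages = 15
observed_seconds = 120.0

ideal_shared_seconds = fair_share_completion_seconds(solo_seconds, concurrent_stages)
observed_slowdown = observed_seconds / solo_seconds

print(f"ideal fair-share completion: {ideal_shared_seconds:.0f}s")
print(f"observed slowdown factor:    {observed_slowdown:.1f}x")
```

Under that assumption, an observed time below the fair-share figure would be consistent with
plain resource contention rather than a scheduler defect; if a lone stage uses only part of the
cluster, the same observation would point to some other bottleneck.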

-----Original Message-----
From: elyast [] 
Sent: Friday, March 07, 2014 20:01
Subject: Re: major Spark performance problem


There is also the option of running Spark applications on top of Mesos in fine-grained mode,
which makes fair scheduling possible: applications run in parallel and Mesos is responsible
for scheduling all tasks, so in a sense all applications progress together. Obviously
the total may not be faster, but the benefit is fair scheduling (small jobs will
not get stuck behind big ones).

Best regards
Lukasz Jastrzebski

