spark-dev mailing list archives

From "Xuelin Cao.2015" <>
Subject Re: When will spark support "push" style shuffle?
Date Thu, 08 Jan 2015 06:25:43 GMT
Got it. The explanation makes sense. Thank you.

On Thu, Jan 8, 2015 at 1:06 PM, Patrick Wendell [via Apache Spark
Developers List] wrote:

> This question is conflating a few different concepts. I think the main
> question is whether Spark will have a shuffle implementation that
> streams data rather than persisting it to disk/cache as a buffer.
> Spark currently decouples the shuffle write from the read using
> disk/OS cache as a buffer. The two benefits of this approach are
> that it allows intra-query fault tolerance and it makes it easier to
> elastically scale and reschedule work within a job. We consider these
> to be design requirements (think about jobs that run for several hours
> on hundreds of machines). Impala, and similar systems like Dremel and
> F1, do not offer fault tolerance within a query at present. They also
> require gang scheduling the entire set of resources that will exist
> for the duration of a query.
>
> A secondary question is whether our shuffle should have a barrier or
> not. Spark's shuffle currently has a hard barrier between map and
> reduce stages. We haven't seen really strong evidence that removing
> the barrier is a net win. It can help the performance of a single job
> (modestly), but in a multi-tenant workload, it leads to poor
> utilization since you have a lot of reduce tasks that are taking up
> slots waiting for mappers to finish. Many large scale users of
> Map/Reduce disable this feature in production clusters for that
> reason. Thus, we haven't seen compelling evidence for removing the
> barrier at this point, given the complexity of doing so.
>
> It is possible that future versions of Spark will support push-based
> shuffles, potentially in a mode that removes some of Spark's fault
> tolerance properties. But there are many other things we can still
> optimize about the shuffle that would likely come before this.
> - Patrick
> On Wed, Jan 7, 2015 at 6:01 PM, 曹雪林 <[hidden email]> wrote:
> > Hi,
> >
> > I've heard a lot of complaints about Spark's "pull" style shuffle. Is
> > there any plan to support "push" style shuffle in the near future?
> >
> > Currently, the shuffle phase must be completed before the next stage
> > starts. In Impala, by contrast, the shuffled data is said to be
> > "streamed" to the next stage's handler, which greatly saves time. Will
> > Spark support this mechanism one day?
> >
> > Thanks
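
Patrick's point about reduce tasks "taking up slots waiting for mappers to finish" can be illustrated with a toy calculation. The sketch below is not Spark or Map/Reduce code; the function name, parameters, and numbers are all hypothetical, and the model assumes every early reducer holds exactly one slot from its launch until the map phase ends. It ignores any benefit from fetching map output early.

```python
def reducer_slot_seconds_before_map_end(map_phase_seconds, reducer_start_second, early_reducers):
    """Toy model (hypothetical, not a Spark API).

    Returns the slot-seconds that early-started reducers occupy before the
    map phase has finished. During this time they can at most fetch map
    output that already exists; they cannot complete.
    """
    if map_phase_seconds < 0 or reducer_start_second < 0 or early_reducers < 0:
        raise ValueError("all arguments must be non-negative")
    # Reducers launched at or after the end of the map phase never wait.
    wait_seconds = max(0, map_phase_seconds - reducer_start_second)
    return early_reducers * wait_seconds


# Hard barrier: reducers start only once the 600-second map phase is done.
with_barrier = reducer_slot_seconds_before_map_end(600, 600, 50)

# No barrier: 50 reducers launched 30 seconds into the same map phase.
without_barrier = reducer_slot_seconds_before_map_end(600, 30, 50)

print(with_barrier, without_barrier)
```

In this made-up scenario the early reducers hold 28500 slot-seconds that other tenants' tasks could have used, while the hard barrier holds none. Whether that cost outweighs the single-job speedup depends on the workload, which is the trade-off described above.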
