spark-dev mailing list archives

From Shixiong Zhu <zsxw...@gmail.com>
Subject Re: Eliminate copy while sending data : any Akka experts here ?
Date Fri, 21 Nov 2014 03:14:09 GMT
Is it possible for Spark to buffer the mapOutputStatuses (Array[Byte])
messages according to the size of the mapOutputStatuses that have already
been sent but not yet ACKed? The buffer would be cheap, since the
mapOutputStatuses messages are all the same and the memory cost is only a
few pointers.
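
A minimal sketch of the idea, in Python rather than Spark's Scala and not taken from Spark's code: messages wait in a queue while the bytes sent but not yet ACKed exceed a limit. `AckWindow`, `max_unacked_bytes`, and the `send` hook are hypothetical names used only for illustration. Each queued entry holds a reference to the same payload object, which is why the buffer costs only a few pointers.

```python
from collections import deque


class AckWindow:
    """Hold outgoing messages while too many bytes are sent but not yet ACKed."""

    def __init__(self, max_unacked_bytes, send):
        self.max_unacked_bytes = max_unacked_bytes
        self.send = send          # callable(dest, payload); hypothetical transport hook
        self.unacked_bytes = 0
        self.pending = deque()    # (dest, payload) pairs; payload is a shared reference

    def submit(self, dest, payload):
        self.pending.append((dest, payload))
        self._drain()

    def on_ack(self, payload):
        self.unacked_bytes -= len(payload)
        self._drain()

    def _drain(self):
        # Always allow one message in flight so an oversized payload cannot stall.
        while self.pending and (
            self.unacked_bytes == 0
            or self.unacked_bytes + len(self.pending[0][1]) <= self.max_unacked_bytes
        ):
            dest, payload = self.pending.popleft()
            self.unacked_bytes += len(payload)
            self.send(dest, payload)


statuses = bytes(1024)  # stand-in for the serialized mapOutputStatuses
sent = []
window = AckWindow(max_unacked_bytes=2048, send=lambda dest, p: sent.append(dest))
for executor in range(5):
    window.submit(executor, statuses)  # the same object every time, never a copy
```

With these assumed numbers, two messages go out immediately and three wait in `window.pending`; each call to `window.on_ack(statuses)` releases one more.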

Best Regards,
Shixiong Zhu

2014-09-20 16:24 GMT+08:00 Reynold Xin <rxin@databricks.com>:

> BTW - a partial solution here: https://github.com/apache/spark/pull/2470
>
> This doesn't address the 0 size block problem yet, but makes my large job
> on hundreds of terabytes of data much more reliable.
>
>
> On Fri, Jul 4, 2014 at 2:28 AM, Mridul Muralidharan <mridul@gmail.com> wrote:
>
> > In our clusters, the number of containers we can get is high but the
> > memory per container is low, which is why avg_nodes_not_hosting data is
> > rarely zero for ML tasks :-)
> >
> > To update: to unblock our current implementation efforts, we went
> > with broadcast, since it is intuitively easier and a minimal change, and
> > we compress the array as bytes in TaskResult.
> > This is then stored in disk-backed maps to remove memory pressure on
> > the master and workers (otherwise MapOutputTracker becomes a memory hog).
> >
> > But I agree, a compressed bitmap to represent 'large' blocks (anything
> > larger than maxBytesInFlight, actually) and probably the existing
> > mechanism to track non-zero sizes should be fine (we should not really
> > track zero output for a reducer; it is just a waste of space).
> >
> >
> > Regards,
> > Mridul
> >
> > On Fri, Jul 4, 2014 at 3:43 AM, Reynold Xin <rxin@databricks.com> wrote:
> > > Note that in my original proposal, I was suggesting we could track
> > > whether block size = 0 using a compressed bitmap. That way we can still
> > > avoid requests for zero-sized blocks.
> > >
> > >
> > >
> > > On Thu, Jul 3, 2014 at 3:12 PM, Reynold Xin <rxin@databricks.com> wrote:
> > >
> > >> Yes, that number is likely == 0 in any real workload ...
> > >>
> > >>
> > >> On Thu, Jul 3, 2014 at 8:01 AM, Mridul Muralidharan <mridul@gmail.com> wrote:
> > >>
> > >>> On Thu, Jul 3, 2014 at 11:32 AM, Reynold Xin <rxin@databricks.com> wrote:
> > >>> > On Wed, Jul 2, 2014 at 3:44 AM, Mridul Muralidharan <mridul@gmail.com> wrote:
> > >>> >
> > >>> >>
> > >>> >> >
> > >>> >> > The other thing we do need is the location of blocks. This is
> > >>> >> > actually just O(n) because we just need to know where the map
> > >>> >> > was run.
> > >>> >>
> > >>> >> For well-partitioned data, won't this involve a lot of unwanted
> > >>> >> requests to nodes which are not hosting data for a reducer (and a
> > >>> >> lack of ability to throttle)?
> > >>> >>
> > >>> >
> > >>> > Was that a question? (I'm guessing it is). What do you mean exactly?
> > >>>
> > >>>
> > >>> I was not sure if I understood the proposal correctly, hence the
> > >>> query: if I understood it right, the number of wasted requests goes
> > >>> up by num_reducers * avg_nodes_not_hosting data.
> > >>>
> > >>> Of course, if avg_nodes_not_hosting data == 0, then we are fine!
> > >>>
> > >>> Regards,
> > >>> Mridul
> > >>>
> > >>
> > >>
> >
>
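
The zero-size-block tracking discussed in the quoted thread can be sketched roughly as follows. This is an illustration in Python, not Spark's implementation: it uses a plain uncompressed bitmap as a stand-in for the compressed bitmap proposed in the thread, and the helper names (`build_empty_bitmap`, `is_empty`, `maps_to_fetch`) and the sample sizes are hypothetical.

```python
def build_empty_bitmap(block_sizes):
    """Return a bitmap with bit r set when the block for reducer r is empty."""
    bitmap = bytearray((len(block_sizes) + 7) // 8)
    for reducer_id, size in enumerate(block_sizes):
        if size == 0:
            bitmap[reducer_id // 8] |= 1 << (reducer_id % 8)
    return bytes(bitmap)


def is_empty(bitmap, reducer_id):
    """Check whether the block for reducer_id is marked as zero-sized."""
    return bool(bitmap[reducer_id // 8] & (1 << (reducer_id % 8)))


# Hypothetical output sizes (in bytes) per map task, indexed by reducer id.
sizes_by_map = {
    "map-0": [0, 4096, 0, 128],
    "map-1": [0, 0, 0, 0],
    "map-2": [512, 0, 0, 64],
}
bitmaps = {m: build_empty_bitmap(s) for m, s in sizes_by_map.items()}


def maps_to_fetch(reducer_id):
    """List the map outputs a reducer needs, skipping zero-sized blocks."""
    return [m for m, bm in bitmaps.items() if not is_empty(bm, reducer_id)]
```

Under these assumed sizes, reducer 2 issues no fetch requests at all, and reducer 0 contacts only `map-2`, which is the kind of wasted request the bitmap is meant to avoid.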
