spark-dev mailing list archives

From Mridul Muralidharan <mri...@gmail.com>
Subject Re: Eliminate copy while sending data : any Akka experts here ?
Date Thu, 03 Jul 2014 15:01:50 GMT
On Thu, Jul 3, 2014 at 11:32 AM, Reynold Xin <rxin@databricks.com> wrote:
> On Wed, Jul 2, 2014 at 3:44 AM, Mridul Muralidharan <mridul@gmail.com>
> wrote:
>
>>
>> >
>> > The other thing we do need is the location of blocks. This is actually
>> > just O(n) because we just need to know where the map was run.
>>
>> For well partitioned data, wont this not involve a lot of unwanted
>> requests to nodes which are not hosting data for a reducer (and lack
>> of ability to throttle).
>>
>
> Was that a question? (I'm guessing it is). What do you mean exactly?


I was not sure I had understood the proposal correctly, hence the
query. If I understood it right, each reducer would also send requests
to nodes that host no data for it, so the number of wasted requests
goes up by num_reducers * avg_nodes_not_hosting_data.

Of course, if avg_nodes_not_hosting_data == 0, then we are fine!
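
To make the estimate concrete, here is a rough sketch of the count. It
assumes each reducer contacts every node that ran a map, which is only
my reading of the proposal. The function name, node names and block
layout are hypothetical, made up for illustration.

```python
def wasted_requests(hosts_per_reducer, all_nodes):
    """Count requests sent to nodes that hold no data for the requesting reducer.

    Assumption (not confirmed by the proposal): every reducer sends one
    request to every node in all_nodes, whether or not that node holds
    any blocks for it.
    """
    wasted = 0
    for hosts in hosts_per_reducer.values():
        # Nodes contacted by this reducer that have nothing to return.
        wasted += len(all_nodes - hosts)
    return wasted


# Hypothetical cluster: 4 nodes, 3 reducers, well partitioned data.
nodes = {"n1", "n2", "n3", "n4"}
hosts = {
    0: {"n1"},                    # reducer 0: data on one node only
    1: {"n2", "n3"},              # reducer 1: data on two nodes
    2: {"n1", "n2", "n3", "n4"},  # reducer 2: data on every node
}

total = wasted_requests(hosts, nodes)
avg_nodes_not_hosting_data = total / len(hosts)
print(total, avg_nodes_not_hosting_data)
```

Here total equals num_reducers * avg_nodes_not_hosting_data by
construction, and it drops to 0 only when every node hosts data for
every reducer.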

Regards,
Mridul
