spark-user mailing list archives

From Richard Marscher <rmarsc...@localytics.com>
Subject Re: Does RDD.cartesian involve shuffling?
Date Tue, 04 Aug 2015 16:30:29 GMT
That is the only alternative I'm aware of. If either A or B is small
enough to broadcast, then you'd at least be doing the cartesian products all
locally, without also needing to transmit and shuffle A. The exception would
be if Spark somehow optimizes the cartesian product and only transfers the
smaller RDD across the network in the shuffle, but I have no reason to
believe that's true.

I'd try the cartesian first if you haven't tried it at all, just to make sure
it actually is too slow before getting tricky with the broadcast.
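
For illustration, here is a minimal PySpark-style sketch of the two options
discussed in this thread. It assumes `sc` is an existing SparkContext, that
`rdd_a` and `rdd_b` are the RDDs A and B, and that `f` is whatever per-pair
function you need to compute. The function names (`pairs_in_partition`,
`broadcast_pairs`, `cartesian_pairs`) are made up for this example, and the
sketch only makes sense if B really is small enough to collect on the driver.

```python
def pairs_in_partition(partition, b_values, f):
    """Pair every element of one partition of A with every element of B.

    `partition` is an iterator over part of A; `b_values` is the full,
    locally available copy of B; `f` is the (hypothetical) per-pair function.
    """
    for a in partition:
        for b in b_values:
            yield (a, b, f(a, b))


def broadcast_pairs(sc, rdd_a, rdd_b, f):
    """Broadcast variant: assumes B is small enough to collect and ship."""
    # Collect B on the driver and send one read-only copy to each executor.
    b_broadcast = sc.broadcast(rdd_b.collect())
    # A stays where it is; each partition is paired with B locally.
    return rdd_a.mapPartitions(
        lambda part: pairs_in_partition(part, b_broadcast.value, f)
    )


def cartesian_pairs(rdd_a, rdd_b, f):
    """Baseline variant: let Spark build the cartesian product itself."""
    return rdd_a.cartesian(rdd_b).map(lambda ab: (ab[0], ab[1], f(ab[0], ab[1])))
```

Both functions should return an RDD of `(a, b, f(a, b))` tuples, so you can
time `cartesian_pairs` first and only switch to `broadcast_pairs` if the
baseline turns out to be too slow.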

On Tue, Aug 4, 2015 at 12:25 PM, Meihua Wu <rotationsymmetry14@gmail.com>
wrote:

> Thanks, Richard!
>
> I basically have two RDDs, A and B, and I need to compute a value for
> every pair (a, b) with a in A and b in B. My first thought is
> cartesian, but it involves an expensive shuffle.
>
> Any alternatives? How about I convert B to an array and broadcast it
> to every node (assuming B is small enough to fit)?
>
>
>
> On Tue, Aug 4, 2015 at 8:23 AM, Richard Marscher
> <rmarscher@localytics.com> wrote:
> > Yes it does, in fact it's probably going to be one of the more expensive
> > shuffles you could trigger.
> >
> > On Mon, Aug 3, 2015 at 12:56 PM, Meihua Wu <rotationsymmetry14@gmail.com> wrote:
> >>
> >> Does RDD.cartesian involve shuffling?
> >>
> >> Thanks!
> >>
> >
> >
> >
> > --
> > Richard Marscher
> > Software Engineer
> > Localytics
> > Localytics.com | Our Blog | Twitter | Facebook | LinkedIn
>



-- 
*Richard Marscher*
Software Engineer
Localytics
Localytics.com <http://localytics.com/> | Our Blog
<http://localytics.com/blog> | Twitter <http://twitter.com/localytics> |
Facebook <http://facebook.com/localytics> | LinkedIn
<http://www.linkedin.com/company/1148792?trk=tyah>
