spark-user mailing list archives

From Xi Shen <davidshe...@gmail.com>
Subject Re: How to do nested foreach with RDD
Date Sun, 22 Mar 2015 11:03:36 GMT
Hi Reza,

Yes, I just found RDD.cartesian(). Very useful.
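
For the archive, here is a minimal sketch of how this could replace the nested
for-loop. The sample data, the object name CartesianSketch, the helper
orderedPairSums, and the local[*] master are assumptions made up for
illustration; "some math" is stood in for by a simple sum.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object CartesianSketch {
  // Hypothetical helper: pairs every element of `left` with every element of
  // `right`, keeps only pairs with a < b (as in Reza's example), then sums them.
  def orderedPairSums(left: RDD[Int], right: RDD[Int]): RDD[Int] =
    left
      .cartesian(right)                  // RDD[(Int, Int)] of all pairs
      .filter { case (a, b) => a < b }   // drop pairs that are not needed
      .map { case (a, b) => a + b }      // placeholder for the real math

  def main(args: Array[String]): Unit = {
    // Assumption: local mode, for trying this out on a single machine.
    val conf = new SparkConf().setAppName("cartesian-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val rdd1 = sc.parallelize(Seq(1, 2, 3)) // made-up sample data
      val rdd2 = sc.parallelize(Seq(2, 3, 4)) // made-up sample data
      orderedPairSums(rdd1, rdd2).collect().foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

Note that cartesian() yields one pair for every combination of elements, so
for two big RDDs the result can be very large; filtering right after it, as
above, at least keeps the later stages smaller.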

Thanks,
David


On Sun, Mar 22, 2015 at 5:08 PM Reza Zadeh <reza@databricks.com> wrote:

> You can do this with the 'cartesian' product method on RDD. For example:
>
> val rdd1 = ...
> val rdd2 = ...
>
> val combinations = rdd1.cartesian(rdd2).filter{ case (a,b) => a < b }
>
> Reza
>
> On Sat, Mar 21, 2015 at 10:37 PM, Xi Shen <davidshen84@gmail.com> wrote:
>
>> Hi,
>>
>> I have two big RDDs, and I need to do some math on each pair of their
>> elements. Traditionally, this would be a nested for-loop. But with RDDs, it
>> leads to nested RDD operations, which are prohibited.
>>
>> Currently, I am collecting one of them and then doing a nested for-loop to
>> avoid nesting RDDs. But I would like to know if there is a Spark way to do this.
>>
>>
>> Thanks,
>> David
>>
>>
>
