spark-user mailing list archives

From dcmovva <dilip.mo...@gmail.com>
Subject Joining by values
Date Sat, 03 Jan 2015 18:10:24 GMT
I have two pair RDDs in Spark like this:

rdd1 = (1 -> [4,5,6,7])
   (2 -> [4,5])
   (3 -> [6,7])


rdd2 = (4 -> [1001,1000,1002,1003])
   (5 -> [1004,1001,1006,1007])
   (6 -> [1007,1009,1005,1008])
   (7 -> [1011,1012,1013,1010])
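I would like to combine them so that each key in rdd1 maps to the sorted, de-duplicated union of the rdd2 lists that its values point to, like this: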

joinedRdd = (1 -> [1000,1001,1002,1003,1004,1005,1006,1007,1008,1009,1010,1011,1012,1013])
        (2 -> [1000,1001,1002,1003,1004,1006,1007])
        (3 -> [1005,1007,1008,1009,1010,1011,1012,1013])


Can someone suggest how to do this?

Thanks,
Dilip
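
One possible way to think about the transformation, sketched in plain Python rather than against the Spark API: invert rdd1 into (value, key) pairs, look each value up in rdd2, then merge the resulting lists per original key. In Spark these steps would roughly correspond to a flatMap on rdd1, a join with rdd2, and a per-key aggregation such as reduceByKey. The function name join_by_values and the use of plain lists of tuples in place of RDDs are assumptions made for illustration, and the sketch assumes the keys of rdd2 are unique.

```python
from collections import defaultdict


def join_by_values(rdd1, rdd2):
    """Hypothetical helper: rdd1 is a list of (key, [ids]),
    rdd2 is a list of (id, [values]). Plain lists stand in for RDDs."""
    # Step 1 (like flatMap): invert rdd1 into (id, key) pairs.
    inverted = [(v, k) for k, vs in rdd1 for v in vs]

    # Step 2 (like an inner join on id): attach the rdd2 list to each pair.
    # Assumes rdd2 keys are unique; ids missing from rdd2 are dropped.
    lookup = dict(rdd2)
    joined = [(k, lookup[v]) for v, k in inverted if v in lookup]

    # Step 3 (like a per-key aggregation): union the lists per original key.
    merged = defaultdict(set)
    for k, values in joined:
        merged[k].update(values)

    # Sort keys and values to match the layout shown in the question.
    return sorted((k, sorted(vs)) for k, vs in merged.items())


rdd1 = [(1, [4, 5, 6, 7]), (2, [4, 5]), (3, [6, 7])]
rdd2 = [
    (4, [1001, 1000, 1002, 1003]),
    (5, [1004, 1001, 1006, 1007]),
    (6, [1007, 1009, 1005, 1008]),
    (7, [1011, 1012, 1013, 1010]),
]

joined_rdd = join_by_values(rdd1, rdd2)
for key, values in joined_rdd:
    print(key, "->", values)
```

Using a set per key removes duplicates such as 1001 and 1007, which appear in more than one rdd2 list; whether duplicates should be dropped is inferred from the desired output above.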



