spark-user mailing list archives

From Sanjay Subramanian
Subject Extracting values from a Collection
Date Fri, 21 Nov 2014 18:41:38 GMT
hey guys
names.txt
=========
1,paul
2,john
3,george
4,ringo

songs.txt
=========
1,Yesterday
2,Julia
3,While My Guitar Gently Weeps
4,With a Little Help From My Friends
1,Michelle
2,Nowhere Man
3,Norwegian Wood
4,Octopus's Garden

What I want to do is real simple 

Desired Output
==============
(4,(With a Little Help From My Friends, Octopus's Garden))
(2,(Julia, Nowhere Man))
(3,(While My Guitar Gently Weeps, Norwegian Wood))
(1,(Yesterday, Michelle))

My Code
=======
val file1Rdd = sc.textFile("/Users/sansub01/mycode/data/songs/names.txt")
  .map(x => (x.split(",")(0), x.split(",")(1)))
val file2Rdd = sc.textFile("/Users/sansub01/mycode/data/songs/songs.txt")
  .map(x => (x.split(",")(0), x.split(",")(1)))
val file2RddGrp = file2Rdd.groupByKey()
file2Rdd.groupByKey().mapValues(names => names.toSet).collect().foreach(println)

Result
======
(4,Set(With a Little Help From My Friends, Octopus's Garden))
(2,Set(Julia, Nowhere Man))
(3,Set(While My Guitar Gently Weeps, Norwegian Wood))
(1,Set(Yesterday, Michelle))

How can I extract the values from the Set so that the result matches the desired output?
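
One possible direction, sketched with plain Scala collections instead of an RDD so that it runs without Spark: the "Set(...)" prefix comes from the Set's toString, so joining the values with mkString may be enough to get the desired shape. The object name ExtractValues and the helper formatted are hypothetical, and the sample pairs are copied from songs.txt above.

```scala
object ExtractValues {
  // (id, title) pairs as they would look after splitting each line of songs.txt on ","
  val songs: Seq[(String, String)] = Seq(
    "1" -> "Yesterday",
    "2" -> "Julia",
    "3" -> "While My Guitar Gently Weeps",
    "4" -> "With a Little Help From My Friends",
    "1" -> "Michelle",
    "2" -> "Nowhere Man",
    "3" -> "Norwegian Wood",
    "4" -> "Octopus's Garden"
  )

  // Group the titles by id, drop duplicates (as toSet does), and render each
  // group as "(a, b)" rather than relying on the collection's own toString.
  def formatted(pairs: Seq[(String, String)]): Map[String, String] =
    pairs.groupBy(_._1).map { case (id, kvs) =>
      id -> kvs.map(_._2).distinct.mkString("(", ", ", ")")
    }

  def main(args: Array[String]): Unit =
    // Each entry prints as a tuple, e.g. (1,(Yesterday, Michelle)); key order may vary.
    formatted(songs).foreach(println)
}
```

On the RDD from the code above, the same mkString call would presumably go inside mapValues, for example mapValues(names => names.toSet.mkString("(", ", ", ")")), though that variant is untested here.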
