spark-user mailing list archives

From Sanjay Subramanian <sanjaysubraman...@yahoo.com.INVALID>
Subject Re: Extracting values from a Collection
Date Fri, 21 Nov 2014 18:53:19 GMT
Sorry, the last line of the code should be:

file1Rdd.join(file2RddGrp.mapValues(names => names.toSet)).collect().foreach(println)

So the full code and result are:
My Code
=======
val file1Rdd = sc.textFile("/Users/sansub01/mycode/data/songs/names.txt").map(x => (x.split(",")(0), x.split(",")(1)))
val file2Rdd = sc.textFile("/Users/sansub01/mycode/data/songs/songs.txt").map(x => (x.split(",")(0), x.split(",")(1)))
val file2RddGrp = file2Rdd.groupByKey()
file1Rdd.join(file2RddGrp.mapValues(names => names.toSet)).collect().foreach(println)

Result
=======
(4,(ringo,Set(With a Little Help From My Friends, Octopus's Garden)))
(2,(john,Set(Julia, Nowhere Man)))
(3,(george,Set(While My Guitar Gently Weeps, Norwegian Wood)))
(1,(paul,Set(Yesterday, Michelle)))

Again, the question is: how do I extract the values from the Set?
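
One possible way to take the joined records apart is to pattern-match on the nested tuple. The sketch below is only an illustration under stated assumptions: a plain Scala Seq named joined (a hypothetical name) stands in for what file1Rdd.join(...).collect() printed above, so it runs without a SparkContext. The same case patterns should also work inside map or flatMap on the RDD itself.

```scala
// Hypothetical stand-in for the collected join result shown in the Result block.
val joined: Seq[(String, (String, Set[String]))] = Seq(
  ("4", ("ringo", Set("With a Little Help From My Friends", "Octopus's Garden"))),
  ("2", ("john", Set("Julia", "Nowhere Man"))),
  ("3", ("george", Set("While My Guitar Gently Weeps", "Norwegian Wood"))),
  ("1", ("paul", Set("Yesterday", "Michelle")))
)

// Destructure (id, (name, titles)), ignore the name, and turn the Set into a List.
val songsById: Seq[(String, List[String])] =
  joined.map { case (id, (_, titles)) => (id, titles.toList) }

// Alternatively, emit one (id, title) pair per song.
val flat: Seq[(String, String)] =
  joined.flatMap { case (id, (_, titles)) => titles.map(title => (id, title)) }

songsById.foreach(println)
flat.foreach(println)
```

A Set carries no ordering guarantee in general, so the order of titles within each record should not be relied on.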
thanks
sanjay

From: Sanjay Subramanian <sanjaysubramanian@yahoo.com.INVALID>
To: Arun Ahuja <aahuja11@gmail.com>; Andrew Ash <andrew@andrewash.com>
Cc: user <user@spark.apache.org>
Sent: Friday, November 21, 2014 10:41 AM
Subject: Extracting values from a Collection

hey guys

names.txt
=========
1,paul
2,john
3,george
4,ringo

songs.txt
=========
1,Yesterday
2,Julia
3,While My Guitar Gently Weeps
4,With a Little Help From My Friends
1,Michelle
2,Nowhere Man
3,Norwegian Wood
4,Octopus's Garden

What I want to do is real simple 

Desired Output
==============
(4,(With a Little Help From My Friends, Octopus's Garden))
(2,(Julia, Nowhere Man))
(3,(While My Guitar Gently Weeps, Norwegian Wood))
(1,(Yesterday, Michelle))

My Code
=======
val file1Rdd = sc.textFile("/Users/sansub01/mycode/data/songs/names.txt").map(x => (x.split(",")(0), x.split(",")(1)))
val file2Rdd = sc.textFile("/Users/sansub01/mycode/data/songs/songs.txt").map(x => (x.split(",")(0), x.split(",")(1)))
val file2RddGrp = file2Rdd.groupByKey()
file2Rdd.groupByKey().mapValues(names => names.toSet).collect().foreach(println)

Result
=======
(4,Set(With a Little Help From My Friends, Octopus's Garden))
(2,Set(Julia, Nowhere Man))
(3,Set(While My Guitar Gently Weeps, Norwegian Wood))
(1,Set(Yesterday, Michelle))

How can I extract the values from the Set?
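
For reference, one way to get close to the desired output is to join each Set into a single string with mkString. This is a minimal sketch, not a definitive answer: plain Scala collections (the hypothetical names songs, grouped and lines) stand in for the RDDs so that it runs without a SparkContext. On the RDD, the same mkString call could presumably go inside mapValues in place of names.toSet.

```scala
// Plain Scala stand-in for the contents of songs.txt as (id, title) pairs.
val songs = Seq(
  ("1", "Yesterday"), ("2", "Julia"),
  ("3", "While My Guitar Gently Weeps"), ("4", "With a Little Help From My Friends"),
  ("1", "Michelle"), ("2", "Nowhere Man"),
  ("3", "Norwegian Wood"), ("4", "Octopus's Garden")
)

// Same shape as groupByKey().mapValues(_.toSet): id -> Set of titles.
val grouped: Map[String, Set[String]] =
  songs.groupBy(_._1).map { case (id, pairs) => id -> pairs.map(_._2).toSet }

// mkString joins the elements of the Set, dropping the "Set(...)" wrapper.
val lines: Seq[String] =
  grouped.toSeq.sortBy(_._1).map { case (id, titles) =>
    val joinedTitles = titles.mkString(", ")
    s"($id,($joinedTitles))"
  }

lines.foreach(println)
```

The order of titles inside each line depends on the Set implementation, so it may differ from the order in the input file.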


Thanks
sanjay


  