spark-user mailing list archives

From: Pankaj Narang
Subject: Re: Finding most occurrences in a JSON Nested Array
Date: Mon, 05 Jan 2015 16:17:50 GMT
Try as below:

results.map(row => row(1)).collect

Or, to flatten the nested arrays:

var hobbies = results.flatMap(row => row(1))

This puts all the hobbies into one simple, flat collection.

var hbmap = hobbies.map(hobby => (hobby, 1)).reduceByKey((hobcnt1, hobcnt2) => hobcnt1 + hobcnt2)

This aggregates the hobbies into a count per hobby, as below:

{swimming,2}, {hiking,1}

Now

hbmap.map { case (hobby, count) => (count, hobby) }.sortByKey(ascending = false).collect

will give you the hobbies sorted in descending order of their count.

This is pseudocode, but it should help you.
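
For reference, the steps above can be put together as a small self-contained program. This is only a sketch under assumptions: the object name HobbyCounts, the helper countHobbies, the sample people, and the local[2] master are all made up for illustration, and typed (name, hobbies) tuples stand in for the query result rows. On a real Spark SQL Row, row(1) is untyped, so it would probably need a cast to a sequence of strings before flatMap accepts it.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._ // pair-RDD implicits, needed on Spark versions before 1.3

object HobbyCounts {
  // Hypothetical helper: returns (count, hobby) pairs, highest count first.
  def countHobbies(sc: SparkContext,
                   people: Seq[(String, Seq[String])]): Array[(Int, String)] = {
    // Stand-in for the `results` RDD from the post (assumed shape: name, hobbies).
    val results = sc.parallelize(people)

    // Flatten the nested hobby arrays into one RDD of hobby names.
    val hobbies = results.flatMap { case (_, personHobbies) => personHobbies }

    // Pair each hobby with 1 and sum the ones per hobby.
    val hbmap = hobbies.map(hobby => (hobby, 1)).reduceByKey(_ + _)

    // Swap to (count, hobby) so sortByKey orders by count, descending.
    hbmap.map { case (hobby, count) => (count, hobby) }
      .sortByKey(ascending = false)
      .collect()
  }

  def main(args: Array[String]): Unit = {
    // Local mode, for experimentation only (assumed setup).
    val conf = new SparkConf().setMaster("local[2]").setAppName("hobby-counts")
    val sc = new SparkContext(conf)
    try {
      val sample = Seq(
        ("alice", Seq("swimming", "hiking")),
        ("bob", Seq("swimming"))
      )
      countHobbies(sc, sample).foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

If only the few most frequent hobbies are needed, hbmap.top(n) with an ordering on the count is probably cheaper than a full sort, but the sortByKey version stays closest to the steps described above.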

