spark-user mailing list archives

From gtinside <>
Subject flattening a list in spark sql
Date Tue, 02 Sep 2014 20:26:15 GMT
Hi,

I am using jsonRDD in Spark SQL and am having trouble iterating through an array
inside the JSON object. Please refer to the schema below:

 |-- Preferences: struct (nullable = true)
 |    |-- destinations: array (nullable = true)
 |-- user: string (nullable = true)

Sample Data:

 |-- Preferences: struct (nullable = true)
 |    |-- destinations: ("Paris","NYC","LA","EWR")
 |-- user: "test1"

 |-- Preferences: struct (nullable = true)
 |    |-- destinations: ("Paris","SFO")
 |-- user: "test2"

My requirement is to run a query that displays the number of users per
destination, as follows:

Number of users:10, Destination:Paris
Number of users:20, Destination:NYC
Number of users:30, Destination:SFO

To achieve the above-mentioned result, I need to flatten the destinations
array so that each destination becomes its own row, but I am not sure how to
do that. Can you please help?
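
To make the requirement concrete, here is a rough sketch of the result I am
after, written in plain Python rather than Spark SQL. The `records` list and
the `users_per_destination` helper are made up for illustration and only
mirror the sample data above. This is not a Spark solution, just the
flatten-then-count logic I would like to express as a query.

```python
from collections import Counter

# Hypothetical in-memory records mirroring the two sample rows above.
records = [
    {"Preferences": {"destinations": ["Paris", "NYC", "LA", "EWR"]}, "user": "test1"},
    {"Preferences": {"destinations": ["Paris", "SFO"]}, "user": "test2"},
]


def users_per_destination(rows):
    """Flatten each user's destinations array, then count users per destination."""
    # Flatten step: one (destination, user) pair per array element.
    # Both fields are nullable in the schema, so missing values become empty lists.
    # A set is used so a user who lists a destination twice is counted once.
    pairs = {
        (dest, row["user"])
        for row in rows
        for dest in ((row.get("Preferences") or {}).get("destinations") or [])
    }
    # Count step: number of distinct users seen for each destination.
    return Counter(dest for dest, _ in pairs)


counts = users_per_destination(records)
for dest, n in sorted(counts.items()):
    print(f"Number of users:{n}, Destination:{dest}")
```

With the two sample rows, Paris should be counted for both users and every
other destination for one user each.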

