spark-user mailing list archives

From gtinside <gtins...@gmail.com>
Subject flattening a list in spark sql
Date Tue, 02 Sep 2014 20:26:15 GMT
Hi,

I am using jsonRDD in Spark SQL and am having trouble iterating through an
array inside the JSON object. Please refer to the schema below:

 |-- Preferences: struct (nullable = true)
 |    |-- destinations: array (nullable = true)
 |-- user: string (nullable = true)

Sample Data:

 |-- Preferences: struct (nullable = true)
 |    |-- destinations: ("Paris","NYC","LA","EWR")
 |-- user: "test1"

 |-- Preferences: struct (nullable = true)
 |    |-- destinations: ("Paris","SFO")
 |-- user: "test2"


My requirement is to run a query that displays the number of users per
destination, as follows:

Number of users:10, Destination:Paris
Number of users:20, Destination:NYC
Number of users:30, Destination:SFO
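
To make the requirement concrete, here is a rough sketch of the intended
logic in plain Python. It is not Spark SQL and only illustrates the
flatten-then-count step; the `records` list and the `users_per_destination`
helper are hypothetical names, and the data is just the two sample records
shown earlier, so the counts will not match the illustrative numbers above.

```python
from collections import Counter

# Hypothetical in-memory copies of the two sample records from this message.
records = [
    {"user": "test1", "Preferences": {"destinations": ["Paris", "NYC", "LA", "EWR"]}},
    {"user": "test2", "Preferences": {"destinations": ["Paris", "SFO"]}},
]


def users_per_destination(rows):
    """Flatten each user's destinations array, then count users per destination."""
    # One (user, destination) pair per array element; a set avoids counting
    # the same user twice if a destination is repeated in one array.
    pairs = {
        (row["user"], dest)
        # Both fields are nullable in the schema, so fall back to empty values.
        for row in rows
        for dest in (row.get("Preferences") or {}).get("destinations") or []
    }
    return Counter(dest for _, dest in pairs)


counts = users_per_destination(records)
for dest, n in sorted(counts.items()):
    print(f"Number of users:{n}, Destination:{dest}")
```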

To achieve this result in Spark SQL, I need to flatten the destinations
array, but I am not sure how to do it. Can you please help?

Gaurav
