spark-user mailing list archives

From Michael Armbrust <mich...@databricks.com>
Subject Re: flattening a list in spark sql
Date Tue, 02 Sep 2014 22:21:00 GMT
Check out LATERAL VIEW explode:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LateralView
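
A minimal sketch of what that could look like for the schema quoted below. The table name `prefs` is hypothetical (it assumes the jsonRDD has been registered as a table under that name), and `d` and `dest` are just illustrative aliases. As far as I know, in Spark releases of this era LATERAL VIEW is only understood by the HiveQL parser, so the query would need to go through a HiveContext rather than the basic SQLContext:

```sql
-- Sketch only: `prefs` is an assumed table name, not from the original post.
-- explode() emits one row per element of the destinations array, so each
-- (user, destination) pair becomes its own row and can then be grouped.
SELECT dest,
       COUNT(DISTINCT `user`) AS num_users  -- backticks in case `user` is treated as a keyword
FROM prefs
LATERAL VIEW explode(Preferences.destinations) d AS dest
GROUP BY dest
```

COUNT(DISTINCT ...) is used so that a user who lists the same destination twice is only counted once; if every user appears in exactly one row with no repeated destinations, COUNT(*) should give the same numbers.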


On Tue, Sep 2, 2014 at 1:26 PM, gtinside <gtinside@gmail.com> wrote:

> Hi,
>
> I am using jsonRDD in Spark SQL and am having trouble iterating through an
> array inside the JSON object. Please refer to the schema below:
>
>  |-- Preferences: struct (nullable = true)
>  |    |-- destinations: array (nullable = true)
>  |-- user: string (nullable = true)
>
> Sample Data:
>
>  |-- Preferences: struct (nullable = true)
>  |    |-- destinations: ("Paris","NYC","LA","EWR")
>  |-- user: "test1"
>
>  |-- Preferences: struct (nullable = true)
>  |    |-- destinations: ("Paris","SFO")
>  |-- user: "test2"
>
>
> My requirement is to run a query that displays the number of users per
> destination, as follows:
>
> Number of users:10, Destination:Paris
> Number of users:20, Destination:NYC
> Number of users:30, Destination:SFO
>
> To achieve the result above, I need to flatten the destinations array,
> but I am not sure how to do it. Can you please help?
>
> Gaurav
