spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Kevin Mellott <kevin.r.mell...@gmail.com>
Subject Flattening Data within DataFrames
Date Mon, 29 Feb 2016 16:54:57 GMT
Fellow Sparkers,

I'm trying to "flatten" my view of data within a DataFrame, and am having
difficulties doing so. The DataFrame contains product information, which
includes multiple levels of categories (primary, secondary, etc).

*Example Data (Raw):*
*Name                    Level            Category*
Baked Code            Food             1
Baked Code            Seafood         2
Baked Code            Fish               3
Hockey Stick          Sports            1
Hockey Stick          Hockey          2
Hockey Stick          Equipment      3

*Desired Data:*
*Name                    Category1     Category2     Category3*
Baked Code            Food              Seafood         Fish
Hockey Stick          Sports            Hockey          Equipment

*Approach:*
After parsing the "raw" information into two separate DataFrames
(called *products
*and *categories*) and registering them as a Spark SQL tables, I was
attempting to perform the following query to flatten this all into the
"desired data" (depicted above).

products.registerTempTable("products")
categories.registerTempTable("categories")

val productList = sqlContext.sql(
  " SELECT p.Name, " +
  " c1.Description AS Category1, " +
  " c2.Description AS Category2, " +
  " c3.Description AS Category3 " +
  " FROM products AS p " +
  "   JOIN categories AS c1 " +
  "     ON c1.Name = p.Name AND c1.Level = '1' "
  "   JOIN categories AS c2 " +
  "     ON c2.Name = p.Name AND c2.Level = '2' "
  "   JOIN categories AS c3 " +
  "     ON c3.Name = p.Name AND c3.Level = '3' "

*Issue:*
I get an error when running my query above, because I am not able to JOIN
the *categories* table more than once. Has anybody dealt with this type of
use case before, and if so how did you achieve the desired behavior?

Thank you in advance for your thoughts.

Kevin

Mime
View raw message