spark-user mailing list archives

From Mohammed Guller <moham...@glassbeam.com>
Subject RE: convert SQL multiple Join in Spark
Date Thu, 03 Mar 2016 22:08:30 GMT
Why not use Spark SQL?

Mohammed
Author: Big Data Analytics with Spark<http://www.amazon.com/Big-Data-Analytics-Spark-Practitioners/dp/1484209656/>

From: Vikash Kumar [mailto:vikashspark@gmail.com]
Sent: Wednesday, March 2, 2016 8:29 PM
To: user@spark.apache.org
Subject: convert SQL multiple Join in Spark


I need to convert the SQL query below into Spark/Scala. Can anybody suggest how to implement
this in Spark?

SELECT a.PERSON_ID as RETAINED_PERSON_ID,
       a.PERSON_ID,
       a.PERSONTYPE,
       'y' as HOLDOUT,
       d.LOCATION,
       b.HHID,
       a.AGE_OUTPUT as AGE,
       a.FIRST_NAME,
       d.STARTDATE,
       d.ENDDATE,
       'Not In Campaign' as HH_TYPE
FROM PERSON_MASTER_VIEW a
    INNER JOIN PERSON_ADDRESS_HH_KEYS b
        on a.PERSON_ID = b.PERSON_ID
    LEFT JOIN #Holdouts c
        on a.PERSON_ID = c.RETAINED_PERSON_ID
    INNER JOIN #Holdouts d
        on b.HHID = d.HHID
WHERE c.RETAINED_PERSON_ID IS NULL and a.PERSONTYPE IS NOT NULL
GROUP BY a.PERSON_ID, a.PERSONTYPE, b.HHID, a.AGE_OUTPUT,
         a.FIRST_NAME, d.LOCATION, d.STARTDATE, d.ENDDATE
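
For illustration, here is a minimal sketch of what the "use Spark SQL" suggestion might look like in Scala. It is not from the thread and rests on several assumptions: it uses the SparkSession API (Spark 2.0 or later; on 1.x the rough equivalents would be SQLContext and registerTempTable), the sample rows and column types are invented, and the SQL Server temp table #Holdouts is registered under the hypothetical view name Holdouts because "#" is not valid in a Spark SQL table name. In a real job the three DataFrames would be loaded from the actual sources rather than built in memory.

```scala
import org.apache.spark.sql.SparkSession

object HoldoutQuerySketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("holdout-query-sketch")
      .master("local[*]") // local mode, only for trying the sketch out
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data; column names follow the query, types are assumed.
    val personMaster = Seq(
      (1L, "ADULT", 34, "Ann"),
      (2L, "CHILD", 9, "Bob")
    ).toDF("PERSON_ID", "PERSONTYPE", "AGE_OUTPUT", "FIRST_NAME")

    val addressKeys = Seq((1L, 100L), (2L, 100L)).toDF("PERSON_ID", "HHID")

    val holdouts = Seq((1L, 100L, "NY", "2016-01-01", "2016-03-01"))
      .toDF("RETAINED_PERSON_ID", "HHID", "LOCATION", "STARTDATE", "ENDDATE")

    // Register each DataFrame under the name the SQL refers to.
    // "Holdouts" stands in for the original temp table "#Holdouts".
    personMaster.createOrReplaceTempView("PERSON_MASTER_VIEW")
    addressKeys.createOrReplaceTempView("PERSON_ADDRESS_HH_KEYS")
    holdouts.createOrReplaceTempView("Holdouts")

    // The original query, unchanged apart from the Holdouts table name.
    val result = spark.sql("""
      SELECT a.PERSON_ID AS RETAINED_PERSON_ID,
             a.PERSON_ID,
             a.PERSONTYPE,
             'y' AS HOLDOUT,
             d.LOCATION,
             b.HHID,
             a.AGE_OUTPUT AS AGE,
             a.FIRST_NAME,
             d.STARTDATE,
             d.ENDDATE,
             'Not In Campaign' AS HH_TYPE
      FROM PERSON_MASTER_VIEW a
        INNER JOIN PERSON_ADDRESS_HH_KEYS b ON a.PERSON_ID = b.PERSON_ID
        LEFT JOIN Holdouts c ON a.PERSON_ID = c.RETAINED_PERSON_ID
        INNER JOIN Holdouts d ON b.HHID = d.HHID
      WHERE c.RETAINED_PERSON_ID IS NULL AND a.PERSONTYPE IS NOT NULL
      GROUP BY a.PERSON_ID, a.PERSONTYPE, b.HHID, a.AGE_OUTPUT,
               a.FIRST_NAME, d.LOCATION, d.STARTDATE, d.ENDDATE
    """)

    result.show()
    spark.stop()
  }
}
```

With these sample rows, person 1 is itself a retained person and should be filtered out by the LEFT JOIN ... IS NULL check, while person 2 shares household 100 with a holdout and should be the row that remains. Because the query has no aggregate functions, the GROUP BY only deduplicates rows, so SELECT DISTINCT would likely be an equivalent way to write it.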