In both cases, I am creating a Hive table from a union of the same two queries.

I am not sure how the two approaches differ internally in the way the Hive table is created.

Regards,
Neeraj

On Sun, Mar 31, 2019 at 1:29 PM Jörn Franke <jornfranke@gmail.com> wrote:
Is the select taking longer, or the saving to a file? You seem to save to a file only in the second case.
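
One way to answer this is to time the query and the save separately. The following is only a minimal sketch: `time_step` is a hypothetical helper (not part of Spark), and the workload timed here is a stand-in so that the snippet runs without a cluster.

```python
import time

def time_step(label, action):
    """Run action() once; return (label, seconds elapsed, action's result)."""
    start = time.perf_counter()
    result = action()
    return label, time.perf_counter() - start, result

# Stand-in workload so the sketch runs without Spark.
label, seconds, result = time_step("sum", lambda: sum(range(1000)))
print(f"{label}: {seconds:.4f}s")

# With the DataFrames from Case 2, the same helper could wrap, for example:
#   time_step("query only", lambda: df3.count())
#   time_step("query + save", lambda: df3.write.saveAsTable(<table_name>))
```

Note that `df3.count()` is only a rough proxy for the select, since Spark may skip work that a count does not need; the difference between the two timings is therefore an estimate of the save cost, not an exact figure.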

On 29.03.2019 at 15:10, neeraj bhadani <bhadani.neeraj.08@gmail.com> wrote:

Hi Team,
I am executing the same Spark code using the Spark SQL API and the DataFrame API; however, the Spark SQL version is taking longer than expected.

Please find the pseudocode below.
-----------------------------------------------------------------------------------------------

Case 1 : Spark SQL

-----------------------------------------------------------------------------------------------

%sql
CREATE TABLE <tbl_name> AS
WITH <table_1> AS (
    <qry1>
),
<table_2> AS (
    <qry2>
)
SELECT * FROM <table_1>
UNION ALL
SELECT * FROM <table_2>


-----------------------------------------------------------------------------------------------

Case 2 : DataFrame API

-----------------------------------------------------------------------------------------------


df1 = spark.sql(<qry1>)
df2 = spark.sql(<qry2>)
df3 = df1.union(df2)
df3.write.saveAsTable(<table_name>)

-----------------------------------------------------------------------------------------------


As I understand it, Spark SQL and the DataFrame API both generate the same code under the hood, so the execution times should be similar.


Regards,

Neeraj