spark-user mailing list archives

From Pralabh Kumar <pralabhku...@gmail.com>
Subject Hive on Spark vs Spark on Hive(HiveContext)
Date Thu, 01 Jul 2021 09:51:17 GMT
Hi Dev

I have thousands of legacy Hive queries. As part of a move to Spark, we
plan to migrate these queries so that they run on Spark. There are two
approaches:


   1. Hive on Spark, which only changes the execution engine of the Hive
   queries, much like switching the engine to Tez.
   2. Migrating the Hive queries to HiveContext/Spark SQL, an approach
   used by Facebook and presented at a Spark conference:
   https://databricks.com/session/experiences-migrating-hive-workload-to-sparksql#:~:text=Spark%20SQL%20in%20Apache%20Spark,SQL%20with%20minimal%20user%20intervention
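
To make the difference concrete: with option 1 the queries stay in Hive and
only the engine setting changes (SET hive.execution.engine=spark; in the Hive
session), while with option 2 the same query text is submitted to Spark SQL.
Below is a rough sketch of option 2 in PySpark, not a tested migration tool.
The helper names (split_statements, run_on_spark_sql), the application name,
and the naive semicolon splitting are assumptions made for illustration. It
uses SparkSession with enableHiveSupport(), since HiveContext has been
deprecated since Spark 2.0.

```python
from typing import List


def split_statements(script: str) -> List[str]:
    """Naively split a HiveQL script on ';' and drop empty statements.

    Assumption: no semicolons appear inside string literals or comments.
    """
    return [stmt.strip() for stmt in script.split(";") if stmt.strip()]


def run_on_spark_sql(script: str, app_name: str = "hive-migration-sketch"):
    """Option 2: submit each statement to Spark SQL with Hive support."""
    # Imported lazily so split_statements works without PySpark installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName(app_name)      # hypothetical application name
        .enableHiveSupport()    # resolve tables through the Hive metastore
        .getOrCreate()
    )
    # spark.sql returns a DataFrame per statement; SELECT results are only
    # computed once an action (show, collect, write) is called on them.
    return [spark.sql(stmt) for stmt in split_statements(script)]
```

Queries that rely on Hive-specific behaviour may still need changes under
option 2, so each migrated query would have to be validated against its
Hive output.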


Could you please advise which option to choose? I am personally inclined
toward option 2, since it also allows us to use the latest Spark release.

Any help would be appreciated, as there are few comparisons available
online that take Spark 3.0 into account.

Regards
Pralabh Kumar
