spark-user mailing list archives

From Kapil Malik <kapil.ma...@snapdeal.com>
Subject Catalog, SessionCatalog and ExternalCatalog in spark 2.0
Date Sat, 03 Sep 2016 12:19:51 GMT
Hi all,

I have a Spark SQL 1.6 application in production that does the following
when sqlContext.sql(...) is executed:
1. Identify the table name mentioned in the query.
2. Use an external database to decide where the data is located and in
which format (parquet, csv, jdbc, etc.).
3. Load the dataframe.
4. Register it as a temp table (for future calls to this table).
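
To make steps 2 to 4 concrete, here is a minimal sketch in plain Scala with no Spark dependency. Everything in it is hypothetical and for illustration only: TableLocation, LoadedTable and LazyTableRegistry are made-up names, the in-memory map stands in for the external database, and "loading" just builds a placeholder object where the real application would load a DataFrame. It assumes the table name has already been extracted from the query (step 1).

```scala
import scala.collection.mutable

// Hypothetical description of where a table lives, as returned by the external database.
final case class TableLocation(format: String, path: String)

// Stand-in for a loaded DataFrame; the real application would hold a Spark DataFrame here.
final case class LoadedTable(name: String, location: TableLocation)

// Resolves table names on first use and remembers the result for later queries.
final class LazyTableRegistry(lookupLocation: String => Option[TableLocation]) {
  private val registered = mutable.Map.empty[String, LoadedTable]

  // Counts how often step 3 runs, to make the caching visible.
  var loadCount = 0

  def resolve(tableName: String): Option[LoadedTable] =
    registered.get(tableName).orElse {
      lookupLocation(tableName).map { location =>     // step 2: ask the external database
        loadCount += 1
        val table = LoadedTable(tableName, location)  // step 3: placeholder for the actual load
        registered.update(tableName, table)           // step 4: remember it for future calls
        table
      }
    }
}

object LazyTableRegistryDemo {
  // Hypothetical metadata; in the application described above this comes from an external database.
  val metadata: Map[String, TableLocation] = Map(
    "orders" -> TableLocation("parquet", "/data/orders"),
    "users"  -> TableLocation("jdbc", "jdbc:postgresql://db-host/users")
  )

  val registry = new LazyTableRegistry(metadata.get)
}
```

Calling LazyTableRegistryDemo.registry.resolve("orders") twice triggers the load only once; the second call is served from the registered map, and an unknown name yields None.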

This is achieved by extending HiveContext and, correspondingly, HiveCatalog.
I have my own implementation of the trait "Catalog", which overrides the
"lookupRelation" method to do the magic behind the scenes.
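
The shape of that override, reduced to its essentials, looks roughly like the sketch below. Note that SimpleCatalog, InMemoryCatalog, ExternalLookupCatalog and Relation are simplified stand-ins invented for this illustration; they are not Spark's real types, and the real lookupRelation signature in Spark 1.6 is different.

```scala
import scala.collection.mutable

// Simplified stand-ins: these are NOT Spark's real types or signatures.
final case class Relation(tableName: String, source: String)

trait SimpleCatalog {
  def lookupRelation(tableName: String): Relation
}

// Plays the role of the built-in catalog: it only knows tables registered with it.
class InMemoryCatalog extends SimpleCatalog {
  protected val tempTables = mutable.Map.empty[String, Relation]

  def registerTable(relation: Relation): Unit =
    tempTables.update(relation.tableName, relation)

  def lookupRelation(tableName: String): Relation =
    tempTables.getOrElse(
      tableName,
      throw new NoSuchElementException(s"Table not found: $tableName")
    )
}

// Overrides the lookup so that unknown tables are resolved externally and then registered.
class ExternalLookupCatalog(resolveExternally: String => Option[Relation])
    extends InMemoryCatalog {

  override def lookupRelation(tableName: String): Relation =
    if (tempTables.contains(tableName)) {
      super.lookupRelation(tableName)
    } else {
      resolveExternally(tableName) match {
        case Some(relation) =>
          registerTable(relation) // later lookups are served by the parent catalog
          relation
        case None =>
          super.lookupRelation(tableName) // keeps the default "not found" behaviour
      }
    }
}
```

The point of the pattern is that callers keep using the ordinary lookup entry point; only the catalog knows that a miss triggers an external resolution.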

However, in Spark 2.0, I can see the following:
SessionCatalog - contains the lookupRelation method, but is not backed by
any interface or abstract class.
ExternalCatalog - deals with CatalogTable instead of DataFrame / LogicalPlan.
Catalog - also doesn't expose any method to look up a DataFrame / LogicalPlan.

So it looks like my only option is to extend SessionCatalog.
However, I wanted to get feedback on whether there is a better or
recommended approach to achieve this.


Thanks and regards,


Kapil Malik
*Sr. Principal Engineer | Data Platform, Technology*
M: +91 8800836581 | T: 0124-4330000 | EXT: 20910
ASF Centre A | 1st Floor | Udyog Vihar Phase IV |
Gurgaon | Haryana | India

