spark-dev mailing list archives

From Michael Shtelma <>
Subject Re: Accessing the SQL parser
Date Fri, 12 Jan 2018 11:09:14 GMT
Hi AbdealiJK,

In order to get the AST, you can parse your query with the Spark parser:

LogicalPlan logicalPlan = sparkSession.sessionState().sqlParser()
    .parsePlan("select * from ...");  // fill in your query here
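Given that parsed (still unresolved) plan, one way to get at the original question — which tables a query references — is to walk the tree and pick out the unresolved relations. A rough sketch, assuming a `logicalPlan` obtained as above; note that relations nested inside expression-level subqueries are not covered by this simple traversal:

```java
import java.util.ArrayList;
import java.util.List;
import org.apache.spark.sql.catalyst.analysis.UnresolvedRelation;
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan;
import scala.collection.JavaConverters;

// Collect the names of all tables referenced by the (unresolved) plan.
// Table references show up as UnresolvedRelation leaf nodes before analysis.
List<String> tables = new ArrayList<>();
for (LogicalPlan leaf : JavaConverters.seqAsJavaList(logicalPlan.collectLeaves())) {
    if (leaf instanceof UnresolvedRelation) {
        tables.add(((UnresolvedRelation) leaf).tableIdentifier().unquotedString());
    }
}
```

With the table names in hand you could then check permissions and ownership against your catalog before actually executing the query.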

Afterwards, you can implement your custom logic and execute the plan like this:

Dataset<Row> ds = Dataset.ofRows(sparkSession, logicalPlan);

Alternatively, you can manually resolve and optimize the plan, and maybe do something else afterwards:

QueryExecution queryExecution = sparkSession.sessionState().executePlan(logicalPlan);
SparkPlan plan = queryExecution.executedPlan();
RDD<InternalRow> rdd = plan.execute();
System.out.println("rdd.count() = " + rdd.count());
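If you take the resolve-and-optimize route, QueryExecution also exposes the intermediate plans, which is handy for seeing what the analyzer and the Catalyst optimizer actually did to your query. A minimal sketch, assuming an existing `sparkSession` and the `logicalPlan` from above:

```java
import org.apache.spark.sql.execution.QueryExecution;

// Build a QueryExecution and print the plan at each stage.
QueryExecution qe = sparkSession.sessionState().executePlan(logicalPlan);
System.out.println(qe.analyzed().treeString());      // after analysis/resolution
System.out.println(qe.optimizedPlan().treeString()); // after the Catalyst optimizer
System.out.println(qe.executedPlan().treeString());  // physical plan
```

Comparing the analyzed and optimized tree strings is often the quickest way to understand how Spark rewrites a given query.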


On Fri, Jan 12, 2018 at 5:39 AM, Abdeali Kothari
<> wrote:
> I was writing some code to try to auto find a list of tables and databases
> being used in a SparkSQL query. Mainly I was looking to auto-check the
> permissions and owners of all the tables a query will be trying to access.
> I was wondering whether PySpark has some method for me to directly use the
> AST that SparkSQL uses?
> Or is there some documentation on how I can generate and understand the AST
> in Spark?
> Regards,
> AbdealiJK
