spark-dev mailing list archives

From Michael Shtelma <mshte...@gmail.com>
Subject Re: Accessing the SQL parser
Date Fri, 12 Jan 2018 11:09:14 GMT
Hi AbdealiJK,

In order to get the AST, you can parse your query with the Spark SQL parser:

import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan;

LogicalPlan logicalPlan = sparkSession.sessionState().sqlParser()
    .parsePlan("select * from myTable");

Afterwards, you can implement your custom logic and execute the plan like this:

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

Dataset<Row> ds = Dataset.ofRows(sparkSession, logicalPlan);
ds.show();
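
(A note on Dataset.ofRows: in the Scala sources it is declared
private[sql], i.e. it is an internal API, but Scala package-private
compiles to a public method in bytecode, so the call above happens to
work from Java. Being internal, it may change between releases.)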

Alternatively, you can trigger resolution and optimization of the plan
manually and perhaps do something else in between:

import org.apache.spark.rdd.RDD;
import org.apache.spark.sql.catalyst.InternalRow;
import org.apache.spark.sql.execution.QueryExecution;
import org.apache.spark.sql.execution.SparkPlan;

QueryExecution queryExecution =
    sparkSession.sessionState().executePlan(logicalPlan);
SparkPlan plan = queryExecution.executedPlan();
RDD<InternalRow> rdd = plan.execute();
System.out.println("rdd.count() = " + rdd.count());

Best,
Michael


On Fri, Jan 12, 2018 at 5:39 AM, Abdeali Kothari
<abdealikothari@gmail.com> wrote:
> I was writing some code to automatically find the list of tables and
> databases used in a SparkSQL query. Mainly, I was looking to auto-check
> the permissions and owners of all the tables a query will try to access.
>
> I was wondering whether PySpark has some method that lets me directly use
> the AST that SparkSQL uses?
>
> Or is there some documentation on how I can generate and understand the AST
> in Spark?
>
> Regards,
> AbdealiJK
>
