spark-dev mailing list archives

From Russell Spitzer <russell.spit...@gmail.com>
Subject Reserved Words in Spark SQL as TableAliases
Date Mon, 19 Mar 2018 23:02:20 GMT
I found
https://issues.apache.org/jira/browse/SPARK-20964

but it currently seems that a strictIdentifier is allowed to contain any
reserved keyword:

https://github.com/apache/spark/blob/master/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4#L501-L503

https://github.com/apache/spark/blob/master/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4#L703-L707



For example, both of these queries parse successfully:

scala> spark.sql("SELECT MAX(id) FROM FUN GROUP BY id").show
scala> spark.sql("SELECT MAX(id) FROM FUN WHERE GROUP BY id").show

== Parsed Logical Plan ==
'Aggregate ['id], [unresolvedalias('MAX('id), None)]
+- 'SubqueryAlias WHERE
   +- 'UnresolvedRelation `FUN`

This is because the second rule linked above allows "WHERE" to be used as the
table alias. That could let some unintended SQL parse successfully. I think it
might make sense to allow reserved keywords as aliases, but only if they are
actually escaped (e.g. with backticks).

In short, I think strictIdentifier should not match unescaped reserved words,
but perhaps I am missing some history on this.
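To illustrate the behavior I am proposing, here is a minimal sketch (not
Spark's actual parser; the object name, helper, and reserved-word subset are
all hypothetical) of an alias check that rejects reserved words unless they
are backtick-escaped:

```scala
object StrictIdentifierSketch {
  // Hypothetical subset of reserved words; Spark's real list lives in SqlBase.g4.
  private val reserved = Set("SELECT", "FROM", "WHERE", "GROUP", "BY")

  // Accept any non-empty identifier escaped with backticks; accept a plain
  // identifier only if it is not a reserved word.
  def isValidAlias(token: String): Boolean =
    if (token.length > 2 && token.startsWith("`") && token.endsWith("`"))
      true
    else
      token.matches("[A-Za-z_][A-Za-z0-9_]*") &&
        !reserved.contains(token.toUpperCase)
}
```

Under a rule like this, `FUN` and `` `WHERE` `` would be accepted as aliases,
while a bare `WHERE` would be rejected, so the second query above would fail
to parse instead of silently aliasing the table as WHERE.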
