spark-issues mailing list archives

From "Apache Spark (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (SPARK-28195) CheckAnalysis not working for Command and report misleading error message
Date Mon, 01 Jul 2019 04:22:00 GMT

     [ https://issues.apache.org/jira/browse/SPARK-28195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-28195:
------------------------------------

    Assignee:     (was: Apache Spark)

> CheckAnalysis not working for Command and report misleading error message
> -------------------------------------------------------------------------
>
>                 Key: SPARK-28195
>                 URL: https://issues.apache.org/jira/browse/SPARK-28195
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.3.2
>            Reporter: liupengcheng
>            Priority: Major
>
> While executing `InsertIntoDataSourceDirCommand`, we found that its query referenced a
table or view that does not exist, yet the error message we finally got was misleading:
> {code:java}
> Caused by: org.apache.spark.sql.catalyst.analysis.UnresolvedException: Invalid call to dataType on unresolved object, tree: 'kr.objective_id
> at org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute.dataType(unresolved.scala:105)
> at org.apache.spark.sql.types.StructType$$anonfun$fromAttributes$1.apply(StructType.scala:440)
> at org.apache.spark.sql.types.StructType$$anonfun$fromAttributes$1.apply(StructType.scala:440)
> at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at scala.collection.immutable.List.foreach(List.scala:381)
> at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
> at scala.collection.immutable.List.map(List.scala:285)
> at org.apache.spark.sql.types.StructType$.fromAttributes(StructType.scala:440)
> at org.apache.spark.sql.catalyst.plans.QueryPlan.schema$lzycompute(QueryPlan.scala:159)
> at org.apache.spark.sql.catalyst.plans.QueryPlan.schema(QueryPlan.scala:159)
> at org.apache.spark.sql.execution.datasources.DataSource.planForWriting(DataSource.scala:544)
> at org.apache.spark.sql.execution.command.InsertIntoDataSourceDirCommand.run(InsertIntoDataSourceDirCommand.scala:70)
> at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
> at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
> at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
> at org.apache.spark.sql.execution.adaptive.QueryStage.executeCollect(QueryStage.scala:246)
> at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
> at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
> at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3277)
> at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
> at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3276)
> at org.apache.spark.sql.Dataset.<init>(Dataset.scala:190)
> at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:75)
> at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
> at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694)
> at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:277)
> ... 11 more
> {code}
> After looking into the code, I found that this happens because the `runSQLOnFiles` feature
has been supported since 2.3: if the table does not exist and is not a temporary table, the
reference is treated as a query that runs directly on files.
> The `ResolveSQLOnFile` rule analyzes it and returns an `UnresolvedRelation` when resolution
fails (it is not actually SQL on files, so resolution fails). Because a Command has no
children, `CheckAnalysis` skips the `UnresolvedRelation`, and we end up with the misleading
error message above when the command is executed.
> I think we should probably run checkAnalysis on the command's query plan. Or is there a
reason for not checking analysis for commands?
> This issue still seems to exist in the master branch.
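
The skipped check described above can be illustrated with a toy plan tree. This is a minimal sketch, not Spark's actual classes: `Plan`, `Command`, `check_analysis`, and `check_analysis_with_query` are hypothetical names invented for illustration. It assumes only what the report states, namely that the check follows `children` and that a command has none, so a query held in a separate field is never visited.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Plan:
    """Toy logical plan node (hypothetical); traversals follow `children`."""
    name: str
    children: List["Plan"] = field(default_factory=list)
    resolved: bool = True


@dataclass
class Command(Plan):
    """Toy command (hypothetical): the query is a plain field, so `children` stays empty."""
    query: Optional[Plan] = None


def walk(plan: Plan) -> Iterator[Plan]:
    """Pre-order traversal that follows `children` only."""
    yield plan
    for child in plan.children:
        yield from walk(child)


def check_analysis(plan: Plan) -> List[str]:
    """Return the names of unresolved nodes reachable through `children`."""
    return [node.name for node in walk(plan) if not node.resolved]


def check_analysis_with_query(plan: Plan) -> List[str]:
    """Hypothetical variant: also descend into any node's `query` field."""
    problems = check_analysis(plan)
    for node in walk(plan):
        query = getattr(node, "query", None)
        if query is not None:
            problems += check_analysis_with_query(query)
    return problems


# A command whose query references a relation that could not be resolved.
missing = Plan("UnresolvedRelation(missing_table)", resolved=False)
query = Plan("Project", children=[missing])
command = Command("InsertIntoDir", query=query)

print(check_analysis(command))             # the query is never visited
print(check_analysis_with_query(command))  # the unresolved relation is found
```

In this toy model the first check reports nothing because the traversal stops at the command, while the second variant finds the unresolved relation before execution. Whether the real fix should walk the query explicitly or expose it through some other mechanism is a design question for the Spark maintainers.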



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


