spark-issues mailing list archives

From "Hyukjin Kwon (Jira)" <j...@apache.org>
Subject [jira] [Assigned] (SPARK-31590) Metadata-only queries should not include subquery in partition filters
Date Wed, 06 May 2020 01:58:00 GMT

     [ https://issues.apache.org/jira/browse/SPARK-31590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon reassigned SPARK-31590:
------------------------------------

    Assignee: dzcxzl

> Metadata-only queries should not include subquery in partition filters
> ----------------------------------------------------------------------
>
>                 Key: SPARK-31590
>                 URL: https://issues.apache.org/jira/browse/SPARK-31590
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: dzcxzl
>            Assignee: dzcxzl
>            Priority: Trivial
>
> When using SPARK-23877, some SQL queries fail during execution.
> Code to reproduce:
> {code:scala}
>         sql("set spark.sql.optimizer.metadataOnly=true")
>         sql("CREATE TABLE test_tbl (a INT,d STRING,h STRING) USING PARQUET PARTITIONED BY (d ,h)")
>         sql("""
>             |INSERT OVERWRITE TABLE test_tbl PARTITION(d,h)
>             |SELECT 1,'2020-01-01','23'
>             |UNION ALL
>             |SELECT 2,'2020-01-02','01'
>             |UNION ALL
>             |SELECT 3,'2020-01-02','02'
>             """.stripMargin)
>         sql(
>           s"""
>              |SELECT d, MAX(h) AS h
>              |FROM test_tbl
>              |WHERE d= (
>              |  SELECT MAX(d) AS d
>              |  FROM test_tbl
>              |)
>              |GROUP BY d
>         """.stripMargin).collect()
> {code}
> Exception:
> {code:java}
> java.lang.UnsupportedOperationException: Cannot evaluate expression: scalar-subquery#48 []
> ...
> at org.apache.spark.sql.execution.datasources.PartitioningAwareFileIndex.prunePartitions(PartitioningAwareFileIndex.scala:180)
> {code}
> optimizedPlan:
> {code:java}
> Aggregate [d#245], [d#245, max(h#246) AS h#243]
> +- Project [d#245, h#246]
>    +- Filter (isnotnull(d#245) AND (d#245 = scalar-subquery#242 []))
>       :  +- Aggregate [max(d#245) AS d#241]
>       :     +- LocalRelation <empty>, [d#245]
>       +- Relation[a#244,d#245,h#246] parquet
> {code}
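
The optimized plan above still carries `scalar-subquery#242` inside the filter on the partition column `d`, and the stack trace shows partition pruning failing to evaluate it. The following is a minimal, self-contained sketch of the idea in the issue title: keep predicates that contain a subquery out of the partition filters. It is a toy model only. `Expr`, `Attr`, `ScalarSubquery`, `hasSubquery`, and `split` are hypothetical names, not Spark's Catalyst classes, and this is not the actual patch for SPARK-31590.

```scala
// Toy expression tree (hypothetical; not Spark's Catalyst classes).
sealed trait Expr
case class Attr(name: String) extends Expr
case class EqualTo(left: Expr, right: Expr) extends Expr
case class IsNotNull(child: Expr) extends Expr
// Stands in for a scalar subquery that cannot be evaluated at pruning time.
case class ScalarSubquery(id: Int) extends Expr

object PartitionFilterSketch {
  // True if the expression tree contains a scalar subquery anywhere.
  def hasSubquery(e: Expr): Boolean = e match {
    case ScalarSubquery(_) => true
    case EqualTo(l, r)     => hasSubquery(l) || hasSubquery(r)
    case IsNotNull(c)      => hasSubquery(c)
    case Attr(_)           => false
  }

  // Split conjuncts into (safe to use for partition pruning, left for the Filter).
  def split(conjuncts: Seq[Expr]): (Seq[Expr], Seq[Expr]) =
    conjuncts.partition(e => !hasSubquery(e))

  // The two conjuncts from the Filter node in the optimized plan above.
  val conjuncts: Seq[Expr] =
    Seq(IsNotNull(Attr("d")), EqualTo(Attr("d"), ScalarSubquery(242)))

  val (prunable, remaining) = split(conjuncts)
}
```

In this sketch only `isnotnull(d)` would be handed to partition pruning, while the comparison against the subquery stays behind to be evaluated once the subquery has a value.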



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
