spark-issues mailing list archives

From "Ankit Raj Boudh (Jira)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-30298) bucket join cannot work for self-join with views
Date Thu, 19 Dec 2019 10:56:00 GMT

    [ https://issues.apache.org/jira/browse/SPARK-30298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16999959#comment-16999959 ]

Ankit Raj Boudh commented on SPARK-30298:
-----------------------------------------

[~imback82] OK, thank you for raising the PR.

> bucket join cannot work for self-join with views
> ------------------------------------------------
>
>                 Key: SPARK-30298
>                 URL: https://issues.apache.org/jira/browse/SPARK-30298
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Xiaoju Wu
>            Priority: Minor
>
> This unit test may fail at its last assertion:
> {code:java}
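> test("bucket join cannot work for self-join with views") {
>     // Keep the broadcast threshold tiny so the planner cannot pick a broadcast join.
>     withSQLConf(SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key -> "1") {
>       withTable("t1") {
>         // Drop the view when the test finishes so it does not leak into other tests.
>         withView("v1") {
>         val df = (0 until 20).map(i => (i, i)).toDF("i", "j").as("df")
>         // Write t1 as a table bucketed into 8 buckets on column i.
>         df.write
>           .format("parquet")
>           .bucketBy(8, "i")
>           .saveAsTable("t1")
>         sql("create view v1 as select * from t1").collect()
>         // Table-to-table self-join on the bucket column: no shuffle is expected.
>         val plan1 = sql("SELECT * FROM t1 a JOIN t1 b ON a.i = b.i").queryExecution.executedPlan
>         assert(plan1.collect { case exchange : ShuffleExchangeExec => exchange }.isEmpty)
>         // Same join through the view: this assertion is the one that may fail.
>         val plan2 = sql("SELECT * FROM t1 a JOIN v1 b ON a.i = b.i").queryExecution.executedPlan
>         assert(plan2.collect { case exchange : ShuffleExchangeExec => exchange }.isEmpty)
>         }
>       }
>     }
>   }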
> {code}
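> The cause is that the view adds a Project with Alias expressions, so the join's
> requiredChildDistribution is expressed in terms of the aliased attributes, while ProjectExec
> passes its child's outputPartitioning up unchanged, without the aliases. The satisfies check
> in EnsureRequirements therefore fails, and a shuffle is inserted.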
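
The mismatch described in the issue can be illustrated with a small toy model. This is only a sketch and does not use Spark's real classes: `Attr`, `HashPartitioning`, `ClusteredDistribution`, `satisfies` and `remap` are hypothetical stand-ins defined here, and the expression IDs are made up. It assumes that an alias gives a column a fresh ID and that a partitioning satisfies a required distribution only when its keys match by ID. The `remap` helper shows what an alias-aware check would see; it is an illustration, not necessarily what the PR does.

```scala
// Toy model, NOT Spark's real classes: an attribute is identified by an expression ID,
// and an alias (such as the one a view's Project adds) gives the column a fresh ID.
final case class Attr(name: String, exprId: Int)

// Hash partitioning reported by a physical operator for its output.
final case class HashPartitioning(keys: Seq[Attr], numPartitions: Int)

// Distribution a join requires from a child: rows clustered by these attributes.
final case class ClusteredDistribution(clustering: Seq[Attr])

object BucketJoinAliasSketch {
  // Simplified check: every partitioning key must appear among the clustering keys,
  // compared by expression ID rather than by name.
  def satisfies(p: HashPartitioning, d: ClusteredDistribution): Boolean =
    p.keys.forall(k => d.clustering.exists(_.exprId == k.exprId))

  // Hypothetical helper: rewrite partitioning keys through a projection's alias map
  // (child expression ID -> aliased output attribute).
  def remap(p: HashPartitioning, aliases: Map[Int, Attr]): HashPartitioning =
    p.copy(keys = p.keys.map(k => aliases.getOrElse(k.exprId, k)))

  def main(args: Array[String]): Unit = {
    val tableI = Attr("i", 1) // t1.i as read from the bucketed table
    val viewI  = Attr("i", 2) // v1.i: same column, re-exposed by the view under a new ID

    val scanPartitioning = HashPartitioning(Seq(tableI), 8)
    val required         = ClusteredDistribution(Seq(viewI))

    // The projection passes the child's partitioning up unchanged: the check fails.
    println(satisfies(scanPartitioning, required)) // prints false

    // With the keys rewritten through the alias map, the check passes.
    val aliasAware = remap(scanPartitioning, Map(tableI.exprId -> viewI))
    println(satisfies(aliasAware, required)) // prints true
  }
}
```

In this model the first check fails for the same reason the second assertion in the test does: the data is partitioned correctly, but the partitioning is described with attributes the join does not recognise.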



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

