spark-issues mailing list archives

From "Lai Zhou (Jira)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-28860) Using ColumnStats of join key to get TableAccessCardinality when finding star joins in ReorderJoinRule
Date Fri, 23 Aug 2019 07:44:00 GMT

     [ https://issues.apache.org/jira/browse/SPARK-28860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lai Zhou updated SPARK-28860:
-----------------------------
    Description: 
Currently, star-schema detection uses TableAccessCardinality to reorder the dimension tables
(dimTables) when the join is a selective star join (isSelectiveStarJoin).

[StarSchemaDetection.scala#L341|https://github.com/apache/spark/blob/98e1a4cea44d7cb2f6d502c0202ad3cac2a1ad8d/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/StarSchemaDetection.scala#L341]
{code:java}
if (isSelectiveStarJoin(dimTables, conditions)) {
  val reorderDimTables = dimTables.map { plan =>
    TableAccessCardinality(plan, getTableAccessCardinality(plan))
  }.sortBy(_.size).map {
    case TableAccessCardinality(p1, _) => p1
  }{code}
 

However, the getTableAccessCardinality method doesn't consider the ColumnStats of the equi-join key.
I'm not sure whether we should compute the join cardinality for each dimTable based on its join key
here.

[~ioana-delaney]
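
To make the question above concrete, here is a minimal, self-contained sketch of what ordering dimension tables by a join-key-based estimate could look like. It is not Spark code: DimStats, FactStats, estimateJoinCardinality and reorder are hypothetical names invented for illustration, and the estimate is the textbook equi-join formula |F| * |D| / max(ndv(F.key), ndv(D.key)), assumed here rather than taken from StarSchemaDetection.

```scala
// Hypothetical, simplified statistics holders; these are NOT Spark classes.
// rowCount for a dimension is its estimated row count after local filters,
// joinKeyNdv is the number of distinct values of its join key.
final case class DimStats(name: String, rowCount: BigInt, joinKeyNdv: BigInt)

// joinKeyNdvByDim maps a dimension name to the distinct-value count of the
// fact-table column that joins to that dimension (assumed to come from column stats).
final case class FactStats(rowCount: BigInt, joinKeyNdvByDim: Map[String, BigInt])

object JoinKeyCardinalitySketch {

  // Textbook equi-join estimate: |F| * |D| / max(ndv(F.key), ndv(D.key)).
  // Returns None when the fact-side column statistics are missing.
  def estimateJoinCardinality(fact: FactStats, dim: DimStats): Option[BigInt] =
    fact.joinKeyNdvByDim.get(dim.name).map { factNdv =>
      // Guard against a zero distinct count to avoid division by zero.
      val maxNdv = factNdv.max(dim.joinKeyNdv).max(BigInt(1))
      fact.rowCount * dim.rowCount / maxNdv
    }

  // Order dimensions by estimated join cardinality when every estimate is
  // available; otherwise fall back to ordering by dimension row count only.
  def reorder(fact: FactStats, dims: Seq[DimStats]): Seq[DimStats] = {
    val estimates = dims.map(estimateJoinCardinality(fact, _))
    if (estimates.forall(_.isDefined)) {
      dims.zip(estimates.flatten).sortBy(_._2).map(_._1)
    } else {
      dims.sortBy(_.rowCount)
    }
  }
}
```

For example, with a fact table of 1,000,000 rows, a dimension dimA of 100 filtered rows (fact-side key ndv 1000) and a dimension dimB of 50 rows (fact-side key ndv 50), ordering by row count alone puts dimB first, while the join-key estimate puts dimA first (100,000 estimated rows versus 1,000,000).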

  was:
Currently, star-schema detection uses TableAccessCardinality to reorder the dimension tables
(dimTables) when the join is a selective star join (isSelectiveStarJoin).

[StarSchemaDetection.scala#L341|https://github.com/apache/spark/blob/98e1a4cea44d7cb2f6d502c0202ad3cac2a1ad8d/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/StarSchemaDetection.scala#L341]

 
{code:java}
if (isSelectiveStarJoin(dimTables, conditions)) {
  val reorderDimTables = dimTables.map { plan =>
    TableAccessCardinality(plan, getTableAccessCardinality(plan))
  }.sortBy(_.size).map {
    case TableAccessCardinality(p1, _) => p1
  }{code}
 

 

However, the getTableAccessCardinality method doesn't consider the ColumnStats of the equi-join key.
I'm not sure whether we should compute the join cardinality for each dimTable based on its join key
here.

[~ioana-delaney]


> Using ColumnStats of join key to get TableAccessCardinality when finding star joins in ReorderJoinRule
> -------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-28860
>                 URL: https://issues.apache.org/jira/browse/SPARK-28860
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.4.3
>            Reporter: Lai Zhou
>            Priority: Minor
>
> Currently, star-schema detection uses TableAccessCardinality to reorder the dimension tables
> (dimTables) when the join is a selective star join (isSelectiveStarJoin).
> [StarSchemaDetection.scala#L341|https://github.com/apache/spark/blob/98e1a4cea44d7cb2f6d502c0202ad3cac2a1ad8d/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/StarSchemaDetection.scala#L341]
> {code:java}
> if (isSelectiveStarJoin(dimTables, conditions)) {
>   val reorderDimTables = dimTables.map { plan =>
>     TableAccessCardinality(plan, getTableAccessCardinality(plan))
>   }.sortBy(_.size).map {
>     case TableAccessCardinality(p1, _) => p1
>   }{code}
>  
> However, the getTableAccessCardinality method doesn't consider the ColumnStats of the equi-join key.
> I'm not sure whether we should compute the join cardinality for each dimTable based on its join key
> here.
> [~ioana-delaney]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
