spark-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Dongjoon Hyun (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-14939) Add FoldablePropagation optimizer
Date Sun, 08 May 2016 00:51:12 GMT

     [ https://issues.apache.org/jira/browse/SPARK-14939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Dongjoon Hyun updated SPARK-14939:
----------------------------------
    Description: 
This issues aims to add new FoldablePropagation optimizer that propagates foldable expressions.
Basically, FoldablePropagation replaces all attributes with the aliases of original foldable
expression except Union/Command queries. Aliases and ordinal expressions are the main target
to be transformed after propagation. Other optimizations will take advantage of the propagated
foldable expressions: e.g. EliminateSorts optimizer now can handle the following Case 2 and
3. (Case 1 is the previous implementation.)

1. Literals and foldable expression, e.g. "ORDER BY 1.0, 'abc', Now()"
2. Foldable ordinals, e.g. "SELECT 1.0, 'abc', Now() ORDER BY 1, 2, 3"
3. Foldable aliases, e.g. "SELECT 1.0 x, 'abc' y, Now() z ORDER BY x, y, z"

**Before**
{code}
scala> sql("SELECT 1.0, Now() x ORDER BY 1, x").explain
== Physical Plan ==
WholeStageCodegen
:  +- Sort [1.0#5 ASC,x#0 ASC], true, 0
:     +- INPUT
+- Exchange rangepartitioning(1.0#5 ASC, x#0 ASC, 200), None
   +- WholeStageCodegen
      :  +- Project [1.0 AS 1.0#5,1461873043577000 AS x#0]
      :     +- INPUT
      +- Scan OneRowRelation[]
{code}

**After**
{code}
scala> sql("SELECT 1.0, Now() x ORDER BY 1, x").explain
== Physical Plan ==
WholeStageCodegen
:  +- Project [1.0 AS 1.0#5,1461873079484000 AS x#0]
:     +- INPUT
+- Scan OneRowRelation[]
{code}

  was:
This issue aims to add new FoldablePropagation optimizer that propagates foldable expressions
up over the logical plan nodes. Currently, ORDER BY, GROUP BY, Nested-SELECT clauses are supported,
and aliases and ordinal expressions are the main target to be transformed after propagation.
Other optimizations will take advantage of the propagated foldable expressions: e.g. EliminateSorts
optimizer now can handle Case 2 and 3. (Case 1 is the previous implementation.)

1. Literals and foldable expression, e.g. "ORDER BY 1.0, 'abc', Now()"
2. Foldable ordinals, e.g. "SELECT 1.0, 'abc', Now() ORDER BY 1, 2, 3"
3. Foldable aliases, e.g. "SELECT 1.0 x, 'abc' y, Now() z ORDER BY x, y, z"

**Before**
{code}
scala> sql("SELECT 1.0, Now() x ORDER BY 1, x").explain
== Physical Plan ==
WholeStageCodegen
:  +- Sort [1.0#5 ASC,x#0 ASC], true, 0
:     +- INPUT
+- Exchange rangepartitioning(1.0#5 ASC, x#0 ASC, 200), None
   +- WholeStageCodegen
      :  +- Project [1.0 AS 1.0#5,1461873043577000 AS x#0]
      :     +- INPUT
      +- Scan OneRowRelation[]
{code}

**After**
{code}
scala> sql("SELECT 1.0, Now() x ORDER BY 1, x").explain
== Physical Plan ==
WholeStageCodegen
:  +- Project [1.0 AS 1.0#5,1461873079484000 AS x#0]
:     +- INPUT
+- Scan OneRowRelation[]
{code}


> Add FoldablePropagation optimizer
> ---------------------------------
>
>                 Key: SPARK-14939
>                 URL: https://issues.apache.org/jira/browse/SPARK-14939
>             Project: Spark
>          Issue Type: Improvement
>          Components: Optimizer
>            Reporter: Dongjoon Hyun
>
> This issues aims to add new FoldablePropagation optimizer that propagates foldable expressions.
Basically, FoldablePropagation replaces all attributes with the aliases of original foldable
expression except Union/Command queries. Aliases and ordinal expressions are the main target
to be transformed after propagation. Other optimizations will take advantage of the propagated
foldable expressions: e.g. EliminateSorts optimizer now can handle the following Case 2 and
3. (Case 1 is the previous implementation.)
> 1. Literals and foldable expression, e.g. "ORDER BY 1.0, 'abc', Now()"
> 2. Foldable ordinals, e.g. "SELECT 1.0, 'abc', Now() ORDER BY 1, 2, 3"
> 3. Foldable aliases, e.g. "SELECT 1.0 x, 'abc' y, Now() z ORDER BY x, y, z"
> **Before**
> {code}
> scala> sql("SELECT 1.0, Now() x ORDER BY 1, x").explain
> == Physical Plan ==
> WholeStageCodegen
> :  +- Sort [1.0#5 ASC,x#0 ASC], true, 0
> :     +- INPUT
> +- Exchange rangepartitioning(1.0#5 ASC, x#0 ASC, 200), None
>    +- WholeStageCodegen
>       :  +- Project [1.0 AS 1.0#5,1461873043577000 AS x#0]
>       :     +- INPUT
>       +- Scan OneRowRelation[]
> {code}
> **After**
> {code}
> scala> sql("SELECT 1.0, Now() x ORDER BY 1, x").explain
> == Physical Plan ==
> WholeStageCodegen
> :  +- Project [1.0 AS 1.0#5,1461873079484000 AS x#0]
> :     +- INPUT
> +- Scan OneRowRelation[]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org


Mime
View raw message