spark-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Dongjoon Hyun (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-14939) [SPARK-14939][SQL] Add FoldablePropagation optimizer
Date Mon, 02 May 2016 18:27:13 GMT

     [ https://issues.apache.org/jira/browse/SPARK-14939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Dongjoon Hyun updated SPARK-14939:
----------------------------------
    Description: 
This issue aims to add new FoldablePropagation optimizer that propagates foldable expressions
up over the logical plan nodes. Currently, ORDER BY, GROUP BY, Nested-SELECT clauses are supported,
and aliases and ordinal expressions are the main target to be transformed after propagation.
Other optimizations will take advantage of the propagated foldable expressions: e.g. EliminateSorts
optimizer now can handle Case 2 and 3. (Case 1 is the previous implementation.)

1. Literals and foldable expression, e.g. "ORDER BY 1.0, 'abc', Now()"
2. Foldable ordinals, e.g. "SELECT 1.0, 'abc', Now() ORDER BY 1, 2, 3"
3. Foldable aliases, e.g. "SELECT 1.0 x, 'abc' y, Now() z ORDER BY x, y, z"

**Before**
{code}
scala> sql("SELECT 1.0, Now() x ORDER BY 1, x").explain
== Physical Plan ==
WholeStageCodegen
:  +- Sort [1.0#5 ASC,x#0 ASC], true, 0
:     +- INPUT
+- Exchange rangepartitioning(1.0#5 ASC, x#0 ASC, 200), None
   +- WholeStageCodegen
      :  +- Project [1.0 AS 1.0#5,1461873043577000 AS x#0]
      :     +- INPUT
      +- Scan OneRowRelation[]
{code}

**After**
{code}
scala> sql("SELECT 1.0, Now() x ORDER BY 1, x").explain
== Physical Plan ==
WholeStageCodegen
:  +- Project [1.0 AS 1.0#5,1461873079484000 AS x#0]
:     +- INPUT
+- Scan OneRowRelation[]
{code}

  was:
This issue aims to improve `EliminateSorts` optimizer to handle ordinal (case 2) and alias
(case 3). Case 1 is the current implementation.

1. Literals and foldable expression, e.g. "ORDER BY 1.0, 'abc', Now()"
2. Foldable ordinals, e.g. "SELECT 1.0, 'abc', Now() ORDER BY 1, 2, 3"
3. Foldable aliases, e.g. "SELECT 1.0 x, 'abc' y, Now() z ORDER BY x, y, z"

**Before**
{code}
scala> sql("SELECT 1.0, Now() x ORDER BY 1, x").explain
== Physical Plan ==
WholeStageCodegen
:  +- Sort [1.0#5 ASC,x#0 ASC], true, 0
:     +- INPUT
+- Exchange rangepartitioning(1.0#5 ASC, x#0 ASC, 200), None
   +- WholeStageCodegen
      :  +- Project [1.0 AS 1.0#5,1461873043577000 AS x#0]
      :     +- INPUT
      +- Scan OneRowRelation[]
{code}

**After**
{code}
scala> sql("SELECT 1.0, Now() x ORDER BY 1, x").explain
== Physical Plan ==
WholeStageCodegen
:  +- Project [1.0 AS 1.0#5,1461873079484000 AS x#0]
:     +- INPUT
+- Scan OneRowRelation[]
{code}

    Component/s:     (was: SQL)
                 Optimizer
        Summary: [SPARK-14939][SQL] Add FoldablePropagation optimizer  (was: Improve EliminateSorts
optimizer to handle Ordinal/Alias SortOrders)

> [SPARK-14939][SQL] Add FoldablePropagation optimizer
> ----------------------------------------------------
>
>                 Key: SPARK-14939
>                 URL: https://issues.apache.org/jira/browse/SPARK-14939
>             Project: Spark
>          Issue Type: Improvement
>          Components: Optimizer
>            Reporter: Dongjoon Hyun
>
> This issue aims to add new FoldablePropagation optimizer that propagates foldable expressions
up over the logical plan nodes. Currently, ORDER BY, GROUP BY, Nested-SELECT clauses are supported,
and aliases and ordinal expressions are the main target to be transformed after propagation.
Other optimizations will take advantage of the propagated foldable expressions: e.g. EliminateSorts
optimizer now can handle Case 2 and 3. (Case 1 is the previous implementation.)
> 1. Literals and foldable expression, e.g. "ORDER BY 1.0, 'abc', Now()"
> 2. Foldable ordinals, e.g. "SELECT 1.0, 'abc', Now() ORDER BY 1, 2, 3"
> 3. Foldable aliases, e.g. "SELECT 1.0 x, 'abc' y, Now() z ORDER BY x, y, z"
> **Before**
> {code}
> scala> sql("SELECT 1.0, Now() x ORDER BY 1, x").explain
> == Physical Plan ==
> WholeStageCodegen
> :  +- Sort [1.0#5 ASC,x#0 ASC], true, 0
> :     +- INPUT
> +- Exchange rangepartitioning(1.0#5 ASC, x#0 ASC, 200), None
>    +- WholeStageCodegen
>       :  +- Project [1.0 AS 1.0#5,1461873043577000 AS x#0]
>       :     +- INPUT
>       +- Scan OneRowRelation[]
> {code}
> **After**
> {code}
> scala> sql("SELECT 1.0, Now() x ORDER BY 1, x").explain
> == Physical Plan ==
> WholeStageCodegen
> :  +- Project [1.0 AS 1.0#5,1461873079484000 AS x#0]
> :     +- INPUT
> +- Scan OneRowRelation[]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org


Mime
View raw message