spark-issues mailing list archives

From "Eyal Farago (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SPARK-28304) FileFormatWriter introduces an unconditional join, even when all attributes are constants
Date Mon, 08 Jul 2019 14:22:00 GMT
Eyal Farago created SPARK-28304:
-----------------------------------

             Summary: FileFormatWriter introduces an unconditional join, even when all attributes are constants
                 Key: SPARK-28304
                 URL: https://issues.apache.org/jira/browse/SPARK-28304
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 2.3.2
            Reporter: Eyal Farago


FileFormatWriter derives a required sort order from the partition columns, the bucketing columns,
and any explicitly required ordering. However, in some use cases some (or even all) of these fields
are constant; in these cases the sort can be skipped.

For example, in my use case we add a GUID column identifying a specific (incremental) load; this
can be thought of as a batch id. Since we run one batch at a time, this column is always a
constant, which means there is no need to sort on it. Since we also don't use bucketing
or require an explicit ordering, the entire sort can be skipped in our case.
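
To make the use case concrete, here is a minimal sketch in plain Python. It is a toy model, not Spark code: `required_ordering`, `constant_cols`, and the column names are hypothetical and exist only for illustration. It assumes the required ordering is the concatenation of partition, bucketing, and explicit sort columns, as described above, and shows that dropping known-constant columns leaves nothing to sort on when the only partition column is the batch id.

```python
from typing import List, Set


def required_ordering(partition_cols: List[str],
                      bucket_cols: List[str],
                      sort_cols: List[str],
                      constant_cols: Set[str]) -> List[str]:
    """Concatenate the three sources of ordering, dropping constant columns.

    A column that holds a single value cannot influence the row order,
    so it contributes nothing to the sort.
    """
    full = partition_cols + bucket_cols + sort_cols
    return [c for c in full if c not in constant_cols]


# Use case from the report: one partition column holding a single batch id,
# no bucketing, no explicit ordering.
ordering = required_ordering(["batch_id"], [], [], constant_cols={"batch_id"})
needs_sort = len(ordering) > 0
print(ordering, needs_sort)
```

In this model an empty ordering means no sort is needed. How Spark would learn that a column is constant (for example from a literal projection) is left open here.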

 

I suggest:
 # filter constant columns out of the required ordering calculated by FileFormatWriter.
 # generalize this to any Sort operator in a Spark plan.
 # introduce optimizer rules that remove constants from a sort ordering, potentially eliminating
the sort operator altogether.
 # modify EnsureRequirements to be aware of constant fields when deciding whether to introduce
a sort.
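
The check behind the last suggestion could look roughly like the following. This is again a plain-Python model under stated assumptions, not the actual EnsureRequirements logic: `SortKey`, `prune_constants`, and `ordering_satisfied` are hypothetical names. The assumption is that an existing ordering satisfies a required one when, after constant columns are removed from both, the required keys form a prefix of the existing keys.

```python
from typing import List, Set, Tuple

# (column name, direction), where direction is "asc" or "desc"
SortKey = Tuple[str, str]


def prune_constants(order: List[SortKey], constant_cols: Set[str]) -> List[SortKey]:
    """Drop sort keys on constant columns; they never change the row order."""
    return [key for key in order if key[0] not in constant_cols]


def ordering_satisfied(required: List[SortKey],
                       actual: List[SortKey],
                       constant_cols: Set[str]) -> bool:
    """True when `actual` already provides `required`, ignoring constant columns."""
    req = prune_constants(required, constant_cols)
    act = prune_constants(actual, constant_cols)
    return len(req) <= len(act) and act[:len(req)] == req


required = [("batch_id", "asc"), ("day", "asc")]
actual = [("day", "asc"), ("ts", "asc")]

# Without knowledge of constants, a sort would be introduced.
print(ordering_satisfied(required, actual, constant_cols=set()))
# Knowing batch_id is constant, the existing ordering is enough.
print(ordering_satisfied(required, actual, constant_cols={"batch_id"}))
```

Pruning both sides matters: a constant key in the middle of an existing ordering should not stop the later keys from matching.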




