spark-issues mailing list archives

From "Andrew K Long (Jira)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-30052) Add support for specifying data sort order to the Datasource V2 API
Date Mon, 02 Dec 2019 19:31:00 GMT

     [ https://issues.apache.org/jira/browse/SPARK-30052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew K Long updated SPARK-30052:
----------------------------------
    Description: 
Summary: When Spark optimizes a physical plan, EnsureRequirements traverses the tree and
adds Sort operators wherever a child's output ordering does not satisfy the ordering its
parent requires. Unfortunately, a V2 data source currently has no way to tell the optimizer
that its data is already ordered.

This would involve adding another API to the DataSource V2 API that lets users specify the
underlying sort order of a data source, plus the plumbing that lets EnsureRequirements
understand that sort order.
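
The behavior described above can be modeled with a small sketch in plain Python. This is
not Spark code and not a proposed API: Scan, Sort, output_ordering, ordering_satisfies and
ensure_ordering are hypothetical names used only for illustration, and the prefix check is
an assumed simplification of how a required ordering is matched against a child's output
ordering.

```python
from dataclasses import dataclass
from typing import Tuple

# A sort key is (column name, ascending?).
SortKey = Tuple[str, bool]


@dataclass
class Scan:
    """Leaf node standing in for a V2 data source scan (hypothetical)."""
    name: str
    # Hypothetical hook: the ordering the source reports for its own data.
    # An empty tuple means "no known ordering", which is today's situation.
    output_ordering: Tuple[SortKey, ...] = ()


@dataclass
class Sort:
    """Operator that the planner inserts to enforce an ordering."""
    ordering: Tuple[SortKey, ...]
    child: object

    @property
    def output_ordering(self) -> Tuple[SortKey, ...]:
        return self.ordering


def ordering_satisfies(actual, required) -> bool:
    """Assumed rule: `required` is satisfied when it is a prefix of `actual`."""
    return len(required) <= len(actual) and tuple(actual[:len(required)]) == tuple(required)


def ensure_ordering(child, required):
    """Return `child` unchanged if already ordered, else wrap it in a Sort."""
    if ordering_satisfies(child.output_ordering, required):
        return child
    return Sort(tuple(required), child)


required = (("ts", True),)
unordered = Scan("events")
presorted = Scan("events", output_ordering=(("ts", True), ("id", True)))

print(type(ensure_ordering(unordered, required)).__name__)  # Sort
print(type(ensure_ordering(presorted, required)).__name__)  # Scan
```

In this model the scan that reports no ordering gets a Sort placed on top of it, while the
scan that reports (ts, id) passes through untouched because the required ordering (ts) is a
prefix of what it already provides.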

  was: Currently there is no way in the V2 API to specify the sort order of the data in a dataset.
This means that the optimizer will add sorts to data that is already sorted.


> Add support for specifying data sort order to the Datasource V2 API
> -------------------------------------------------------------------
>
>                 Key: SPARK-30052
>                 URL: https://issues.apache.org/jira/browse/SPARK-30052
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Andrew K Long
>            Priority: Major
>
> Summary: When Spark optimizes a physical plan, EnsureRequirements traverses the tree and
> adds Sort operators wherever a child's output ordering does not satisfy the ordering its
> parent requires. Unfortunately, a V2 data source currently has no way to tell the optimizer
> that its data is already ordered.
> This would involve adding another API to the DataSource V2 API that lets users specify the
> underlying sort order of a data source, plus the plumbing that lets EnsureRequirements
> understand that sort order.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

