spark-issues mailing list archives

From "Gengliang Wang (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-26744) Support schema validation in File Source V2
Date Fri, 01 Feb 2019 03:28:00 GMT

     [ https://issues.apache.org/jira/browse/SPARK-26744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gengliang Wang updated SPARK-26744:
-----------------------------------
    Description: 
The supportDataType method in FileFormat validates the input/output schema before
execution starts, so that we avoid invalid data source IO and users see clean
error messages.

This PR implements the same method in the FileDataSourceV2 framework. Compared with FileFormat,
FileDataSourceV2 has multiple layers. The API is added in two places:

1. FileWriteBuilder: this is where we can get the actual write schema.
2. FileScan: this is where we can get the actual read schema.
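
To illustrate the idea, here is a minimal, dependency-free sketch in Python of validating a schema at the two points listed above, before any IO happens. It is not Spark's code: Field, UnsupportedSchemaError, validate_schema, csv_supports, FileWriteBuilderSketch and FileScanSketch are all hypothetical names, the data types are modeled as plain strings, and the error wording is invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Field:
    """Hypothetical stand-in for one column of a schema."""
    name: str
    data_type: str  # modeled as a plain string, e.g. "int", "string", "array"


class UnsupportedSchemaError(Exception):
    """Raised when a schema contains a type the format cannot handle."""


def validate_schema(format_name: str,
                    schema: List[Field],
                    support_data_type: Callable[[str], bool],
                    operation: str) -> None:
    """Check every field up front so that no IO starts on an invalid schema."""
    for field in schema:
        if not support_data_type(field.data_type):
            raise UnsupportedSchemaError(
                f"{format_name} does not support {field.data_type} "
                f"(column '{field.name}') on {operation}"
            )


def csv_supports(data_type: str) -> bool:
    """Hypothetical per-format rule: only flat, primitive types are accepted."""
    return data_type in {"int", "long", "double", "string", "boolean"}


class FileWriteBuilderSketch:
    """Models the write side: the actual write schema is known here."""

    def __init__(self, format_name, write_schema, support_data_type):
        self.format_name = format_name
        self.write_schema = write_schema
        self.support_data_type = support_data_type

    def build(self) -> str:
        validate_schema(self.format_name, self.write_schema,
                        self.support_data_type, "write")
        return "write-plan"  # placeholder for the real write object


class FileScanSketch:
    """Models the read side: the actual read schema is known here."""

    def __init__(self, format_name, read_schema, support_data_type):
        self.format_name = format_name
        self.read_schema = read_schema
        self.support_data_type = support_data_type

    def plan(self) -> str:
        validate_schema(self.format_name, self.read_schema,
                        self.support_data_type, "read")
        return "scan-plan"  # placeholder for the real scan planning
```

In this sketch a schema containing an unsupported type fails in build() or plan() with a message naming the offending column, rather than failing later in the middle of reading or writing files.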

> Support schema validation in File Source V2
> -------------------------------------------
>
>                 Key: SPARK-26744
>                 URL: https://issues.apache.org/jira/browse/SPARK-26744
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Gengliang Wang
>            Priority: Major
>
> The supportDataType method in FileFormat validates the input/output schema before
> execution starts, so that we avoid invalid data source IO and users see clean
> error messages.
> This PR implements the same method in the FileDataSourceV2 framework. Compared
> with FileFormat, FileDataSourceV2 has multiple layers. The API is added in two places:
> 1. FileWriteBuilder: this is where we can get the actual write schema.
> 2. FileScan: this is where we can get the actual read schema.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


