spark-issues mailing list archives

From "Gengliang Wang (JIRA)" <>
Subject [jira] [Updated] (SPARK-28495) Follow ANSI SQL on table insertion
Date Thu, 01 Aug 2019 07:35:00 GMT


Gengliang Wang updated SPARK-28495:
    Summary: Follow ANSI SQL on table insertion (was: AssignableCast: A new type coercion
following store assignment rules of ANSI SQL)

> Follow ANSI SQL on table insertion
> ----------------------------------
>                 Key: SPARK-28495
>                 URL:
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Gengliang Wang
>            Priority: Major
> In Spark version 2.4 and earlier, when inserting into a table, Spark casts the data
type of the input query to the data type of the target table by coercion. This can be very
confusing, e.g. when users mistakenly write string values to an int column.
> In data source V2, by default, only upcasting is allowed when inserting data into a
table. E.g. int -> long and int -> string are allowed, while decimal -> double and
long -> int are not. The UpCast rules were originally created for Dataset type
coercion. They are quite strict and differ from the behavior of all existing popular DBMSs.
This is a breaking change, and it may hurt some Spark users after the 3.0 release.
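
The upcast-only behavior described above can be illustrated with a small toy model. This is a sketch under stated assumptions, not Spark's UpCast implementation: the function name `can_up_cast`, the plain-string type names, and the integral ordering are all hypothetical, and only the four examples from the paragraph above are treated as ground truth.

```python
# Toy model of "only upcasting is allowed". NOT Spark's actual UpCast code.
# Assumption of this sketch: integral types widen in this order.
INTEGRAL_ORDER = ["byte", "short", "int", "long"]


def can_up_cast(source: str, target: str) -> bool:
    """Return True if writing `source` into `target` counts as an upcast here."""
    if source == target:
        return True
    if target == "string":
        # The issue text lists int -> string as allowed.
        return True
    if source in INTEGRAL_ORDER and target in INTEGRAL_ORDER:
        # Widening only: int -> long is fine, long -> int is not.
        return INTEGRAL_ORDER.index(source) < INTEGRAL_ORDER.index(target)
    # Everything else, including decimal -> double, is rejected in this model.
    return False


if __name__ == "__main__":
    for src, dst in [("int", "long"), ("int", "string"),
                     ("decimal", "double"), ("long", "int")]:
        print(f"{src} -> {dst}: {can_up_cast(src, dst)}")
```

The point of the sketch is only that a narrowing or lossy conversion is rejected at insertion time instead of being silently coerced.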
> This PR proposes following the store assignment rules (section 9.2) of ANSI
SQL. Two significant differences from UpCast:
> 1. Any numeric type can be assigned to another numeric type.
> 2. TimestampType can be assigned to DateType.
> The new behavior is consistent with PostgreSQL. It is easier to explain and more acceptable
than using UpCast.
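
The two listed differences can be written down as a small predicate. This is a minimal sketch, assuming plain-string type names and a hypothetical helper `covered_by_listed_rules`; it models only the two rules named in the issue, not the full ANSI store assignment rules and not Spark's implementation. A `False` result means "not covered by these two rules", not "rejected by Spark".

```python
# Toy model of the two differences from UpCast named in the issue text.
# Assumption of this sketch: this set stands in for "any numeric type".
NUMERIC_TYPES = {"byte", "short", "int", "long", "float", "double", "decimal"}


def covered_by_listed_rules(source: str, target: str) -> bool:
    """Return True if source -> target is permitted by rule 1 or rule 2."""
    # Rule 1: any numeric type can be assigned to another numeric type.
    if source in NUMERIC_TYPES and target in NUMERIC_TYPES:
        return True
    # Rule 2: a timestamp value can be assigned to a date column.
    if source == "timestamp" and target == "date":
        return True
    # Other pairs are outside the scope of this sketch.
    return False


if __name__ == "__main__":
    for src, dst in [("decimal", "double"), ("long", "int"),
                     ("timestamp", "date"), ("string", "int")]:
        print(f"{src} -> {dst}: {covered_by_listed_rules(src, dst)}")
```

Under this model, decimal -> double and long -> int, which the upcast-only policy rejects, fall under rule 1, while string -> int, the mistake mentioned at the top of the description, is covered by neither rule.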

This message was sent by Atlassian JIRA
