spark-issues mailing list archives

From "Hu Fuwang (Jira)" <j...@apache.org>
Subject [jira] [Resolved] (SPARK-29615) Add insertInto method with byName parameter in DataFrameWriter
Date Mon, 28 Oct 2019 05:12:00 GMT

     [ https://issues.apache.org/jira/browse/SPARK-29615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hu Fuwang resolved SPARK-29615.
-------------------------------
    Resolution: Not A Problem

> Add insertInto method with byName parameter in DataFrameWriter
> --------------------------------------------------------------
>
>                 Key: SPARK-29615
>                 URL: https://issues.apache.org/jira/browse/SPARK-29615
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Hu Fuwang
>            Priority: Major
>
> Currently, DataFrameWriter.insertInto ignores column names and resolves columns
> purely by position. Because DataFrameWriter has only one public insertInto method,
> Spark users who do not read its description may assume that Spark matches columns
> by name. In that case the wrong column may be used as the partition column, which
> can cause problems (e.g. a huge number of files/folders created in the Hive table's
> tmp location).
> We propose adding a new insertInto method to DataFrameWriter with a byName parameter
> that lets Spark users specify whether to match columns by name.
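
The difference between the two resolution modes described in the issue can be illustrated with a small model. This is a plain-Python sketch, not Spark code: the helper names insert_by_position and insert_by_name, the table layout, and the sample row are all hypothetical, and the functions only mimic how values would be lined up against the target table's columns.

```python
from typing import Dict, List


def insert_by_position(target_columns: List[str], source_row: Dict[str, object]) -> Dict[str, object]:
    """Model position-based resolution: source column names are ignored,
    and the i-th source value lands in the i-th target column."""
    values = list(source_row.values())
    if len(values) != len(target_columns):
        raise ValueError("source and target have different column counts")
    return dict(zip(target_columns, values))


def insert_by_name(target_columns: List[str], source_row: Dict[str, object]) -> Dict[str, object]:
    """Model name-based resolution: each target column takes the source
    value that carries the same name, regardless of source order."""
    missing = [c for c in target_columns if c not in source_row]
    if missing:
        raise KeyError(f"source is missing columns: {missing}")
    return {c: source_row[c] for c in target_columns}


# Hypothetical target table whose last column, dt, is the partition column.
target = ["id", "value", "dt"]

# Hypothetical source row whose columns are in a different order.
row = {"dt": "2019-10-28", "id": 1, "value": "a"}

by_position = insert_by_position(target, row)
by_name = insert_by_name(target, row)

print(by_position)  # {'id': '2019-10-28', 'value': 1, 'dt': 'a'}
print(by_name)      # {'id': 1, 'value': 'a', 'dt': '2019-10-28'}
```

In the position-based result, the value column ends up in the partition column dt, which is the failure mode the issue describes: every distinct value would become its own partition. With the existing API, one way to avoid this is to reorder the DataFrame's columns to match the target table (for example with a select over the target table's column list) before calling insertInto.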



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


