spark-issues mailing list archives

From "Hu Fuwang (Jira)" <>
Subject [jira] [Created] (SPARK-29615) Add insertInto method with byName parameter in DataFrameWriter
Date Sun, 27 Oct 2019 23:56:00 GMT
Hu Fuwang created SPARK-29615:

             Summary: Add insertInto method with byName parameter in DataFrameWriter
                 Key: SPARK-29615
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 3.0.0
            Reporter: Hu Fuwang

Currently, DataFrameWriter.insertInto ignores column names and resolves columns purely
by position. Because DataFrameWriter has only one public insertInto method, Spark users
may not read its description and may assume that Spark matches columns by name. In that
case, the wrong column may be used as the partition column, which can cause problems
(e.g. a huge number of files/folders may be created in the Hive table's tmp location).

We propose adding a new insertInto method to DataFrameWriter with a byName parameter that
lets Spark users specify whether columns are matched by name.
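
To illustrate the difference between the two resolution modes, here is a minimal sketch
in plain Python. It is a toy model only, not Spark code and not the proposed
implementation: the function names, the table layout (id, name, dt, with dt as the
partition column) and the sample row are all hypothetical.

```python
# Toy model only: plain Python, no Spark involved.
# All names below are hypothetical and chosen for illustration.

def resolve_by_position(table_columns, df_columns, row):
    """Pair the i-th value with the i-th table column; df_columns is ignored."""
    return dict(zip(table_columns, row))


def resolve_by_name(table_columns, df_columns, row):
    """Pair each table column with the value whose DataFrame column has the same name."""
    named = dict(zip(df_columns, row))
    return {col: named[col] for col in table_columns}


# Hypothetical target table, partitioned by `dt` (the last column).
table_columns = ["id", "name", "dt"]

# Hypothetical DataFrame with the same column names in a different order.
df_columns = ["dt", "id", "name"]
row = ("2019-10-27", 1, "alice")

by_position = resolve_by_position(table_columns, df_columns, row)
by_name = resolve_by_name(table_columns, df_columns, row)

# Position-based: the `name` value ends up in the partition column `dt`.
print(by_position)
# Name-based: every value lands in the column with the matching name.
print(by_name)
```

With position-based matching, the value "alice" ends up as the dt partition value, so
each distinct name would produce its own partition directory. This is the failure mode
described above.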
