hive-issues mailing list archives

From "Matt McCline (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-17460) `insert overwrite` should support table schema evolution (e.g. add columns)
Date Thu, 07 Sep 2017 00:13:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-17460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156233#comment-16156233 ]

Matt McCline commented on HIVE-17460:
-------------------------------------

I don't think this is right -- you will end up with upset customers because query results
will be different.

Unfortunately, under the current semantics of adding a column, the default behavior is RESTRICT,
not CASCADE. RESTRICT means the partition schemas do not get updated with the new columns.
Thus, the new columns default to NULL when queried. To get the behavior you are
talking about, you would need to specify the CASCADE option.
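
For illustration, here is a minimal sketch of the CASCADE variant, reusing the `src_table` example
from the issue description. It assumes a Hive release that accepts the CASCADE keyword on
ADD COLUMNS and has not been run against any particular version:

```sql
-- Sketch only: reuses src_table from the issue description.
CREATE TABLE src_table (
  i INT
)
PARTITIONED BY (`date` STRING);

INSERT OVERWRITE TABLE src_table PARTITION (`date`='20170905') VALUES (3);

-- CASCADE propagates the new column to the metadata of existing partitions;
-- the default, RESTRICT, changes only the table-level schema.
ALTER TABLE src_table ADD COLUMNS (bi BIGINT) CASCADE;

INSERT OVERWRITE TABLE src_table PARTITION (`date`='20170905') VALUES (3, 5);

SELECT * FROM src_table WHERE `date` = '20170905';
```

With CASCADE, the existing partition's schema should include `bi`, so the final query would be
expected to return the inserted value for that column rather than NULL.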

So I'm a -1 on this change.

[~wzheng]

> `insert overwrite` should support table schema evolution (e.g. add columns)
> ---------------------------------------------------------------------------
>
>                 Key: HIVE-17460
>                 URL: https://issues.apache.org/jira/browse/HIVE-17460
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 2.1.0, 2.2.0
>            Reporter: Chaozhong Yang
>            Assignee: Chaozhong Yang
>             Fix For: 3.0.0
>
>         Attachments: HIVE-17460.2.patch, HIVE-17460.patch
>
>
> In Hive, adding columns into original table is a common use case. However, if we insert overwrite older partitions after adding columns, added columns will not be accessed.
> ```
> create table src_table(
>         i int
> )
> PARTITIONED BY (`date` string);
> insert overwrite table src_table partition(`date`='20170905') values (3);
> select * from src_table where `date` = '20170905';
> alter table src_table add columns (bi bigint);
> insert overwrite table src_table partition(`date`='20170905') values (3, 5);
> select * from src_table where `date` = '20170905';
> ```
> The result will be as follows:
> ```
> 3, NULL, '20170905'
> ```
> Obviously, it doesn't meet our expectation. The expected result should be:
> ```
> 3, 5, '20170905'
> ```



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
