hive-issues mailing list archives

From "Rajesh Balamohan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-18702) INSERT OVERWRITE TABLE doesn't clean the table directory before overwriting
Date Thu, 13 Jun 2019 02:53:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-18702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16862664#comment-16862664
] 

Rajesh Balamohan commented on HIVE-18702:
-----------------------------------------

Thanks for creating and fixing this ticket. There is another corner case where this would
still produce wrong results.

1. Table `A` is created with partition `y`.
2. Data is added by an external system (say y="1"), but the partition is not yet registered in table A.
3. Run `insert overwrite` on table A.
4. The query would still show the old contents, because in this case `oldPartPath` would be null,
so the existing data would not be deleted.
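
The steps above could be reproduced with something like the following HiveQL. This is only a sketch, not the testcase from the follow-up ticket. The table name `a`, the column `x`, and the location `/tmp/hive18702_a` are made up for illustration. It assumes a non-transactional table and a Hive version that accepts `SELECT` without a `FROM` clause. `INSERT OVERWRITE DIRECTORY` stands in for the external system that writes files without touching the metastore.

```sql
-- Hypothetical names and paths throughout; adjust to your environment.

-- 1. Table A with partition column y.
CREATE TABLE a (x INT) PARTITIONED BY (y STRING)
STORED AS TEXTFILE
LOCATION '/tmp/hive18702_a';

-- 2. Stand-in for the external system: write a file straight into the
--    y=1 directory without registering the partition in the metastore.
INSERT OVERWRITE DIRECTORY '/tmp/hive18702_a/y=1' SELECT 1;

-- The partition should not be listed yet, since nothing registered it.
SHOW PARTITIONS a;

-- 3. Overwrite the (unregistered) partition.
INSERT OVERWRITE TABLE a PARTITION (y = '1') SELECT 2;

-- 4. If the corner case applies, the row written in step 2 would still be
--    returned alongside the newly inserted row.
SELECT * FROM a WHERE y = '1';
```

If the old file survives step 3, listing the partition directory (for example with `dfs -ls /tmp/hive18702_a/y=1;` from the Hive CLI) should show it next to the newly written file.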

I will create a separate ticket to track this issue with a small sample testcase.

> INSERT OVERWRITE TABLE doesn't clean the table directory before overwriting
> ---------------------------------------------------------------------------
>
>                 Key: HIVE-18702
>                 URL: https://issues.apache.org/jira/browse/HIVE-18702
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 2.3.2
>            Reporter: Oleksiy Sayankin
>            Assignee: Ivan Suller
>            Priority: Major
>             Fix For: 4.0.0
>
>         Attachments: HIVE-18702.1.patch, HIVE-18702.2.patch, HIVE-18702.3.patch, HIVE-18702.3.patch, HIVE-18702.4.patch, HIVE-18702.5.patch
>
>
> Enable Hive on TEZ. (MR works fine).
> *STEP 1. Create test data*
> {code}
> nano /home/test/users.txt
> {code}
> Add to file:
> {code}
> Peter,34
> John,25
> Mary,28
> {code}
> {code}
> hadoop fs -mkdir /bug
> hadoop fs -copyFromLocal /home/test/users.txt /bug
> hadoop fs -ls /bug
> {code}
> *EXPECTED RESULT:*
> {code}
> Found 2 items
> -rwxr-xr-x   3 root root         25 2015-10-15 16:11 /bug/users.txt
> {code}
> *STEP 2. Upload data to hive*
> {code}
> create external table bug(name string, age int) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' LOCATION '/bug';
> select * from bug;
> {code}
> *EXPECTED RESULT:*
> {code}
> OK
> Peter   34
> John    25
> Mary    28
> {code}
> {code}
> create external table bug1(name string, age int) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' LOCATION '/bug1';
> insert overwrite table bug select * from bug1;
> select * from bug;
> {code}
> *EXPECTED RESULT:*
> {code}
> OK
> Time taken: 0.097 seconds
> {code}
> *ACTUAL RESULT:*
> {code}
> hive>  select * from bug;
> OK
> Peter	34
> John	25
> Mary	28
> Time taken: 0.198 seconds, Fetched: 3 row(s)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
