hive-issues mailing list archives

From "Sergey Shelukhin (JIRA)" <>
Subject [jira] [Commented] (HIVE-17970) MM LOAD DATA with OVERWRITE doesn't use base_n directory concept
Date Thu, 05 Apr 2018 19:26:00 GMT


Sergey Shelukhin commented on HIVE-17970:

Fixing the HCat build issue... not sure why it didn't trigger locally. Maybe Maven is smart
enough to exclude HCat from my local build based on a usefulness heuristic.

> MM LOAD DATA with OVERWRITE doesn't use base_n directory concept
> ----------------------------------------------------------------
>                 Key: HIVE-17970
>                 URL:
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Transactions
>    Affects Versions: 3.0.0
>            Reporter: Eugene Koifman
>            Assignee: Sergey Shelukhin
>            Priority: Major
>              Labels: mm-gap-2
>         Attachments: HIVE-17970.01.patch, HIVE-17970.patch
> Judging by 
> {code:java}
> Hive.loadTable(Path loadPath, String tableName, LoadFileType loadFileType, boolean isSrcLocal,
>       boolean isSkewedStoreAsSubdir, boolean isAcid, boolean hasFollowingStatsTask,
>       Long txnId, int stmtId, boolean isMmTable)
> {code}
> LOAD DATA with OVERWRITE will delete all existing data, then write the new data into the table.
> This logic makes sense for non-ACID tables, but for ACID/MM tables it should work like an INSERT OVERWRITE
> statement and write the new data to base_n/. This way the lock manager can be used to either get
> an X (exclusive) lock for IOW (INSERT OVERWRITE) and thus block all readers, or let it run with SemiShared and let readers
> continue, which makes the system more concurrent.
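
To illustrate the base_n idea from the description, here is a minimal sketch, not Hive's actual code. It assumes that an overwrite writes into a new directory named base_ plus a zero-padded write ID, and that a reader picks the newest base whose ID falls within its snapshot. The class name, the helper names, the 7-digit padding, and the single "high watermark" snapshot model are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Optional;

// Illustrative only: models base_n selection, not Hive's real implementation.
final class BaseDirSketch {
    static final String BASE_PREFIX = "base_";

    // Hypothetical helper: directory name an overwrite would write into.
    // The 7-digit zero padding is an assumption.
    static String baseDirName(long writeId) {
        return BASE_PREFIX + String.format(Locale.ROOT, "%07d", writeId);
    }

    // Reader side: pick the newest base_n with n <= the highest write ID
    // visible in the reader's snapshot (a simplified snapshot model).
    static Optional<String> visibleBase(List<String> dirs, long highWatermark) {
        String best = null;
        long bestId = -1;
        for (String dir : dirs) {
            if (!dir.startsWith(BASE_PREFIX)) {
                continue; // skip delta directories and anything else
            }
            long id;
            try {
                id = Long.parseLong(dir.substring(BASE_PREFIX.length()));
            } catch (NumberFormatException e) {
                continue; // not a well-formed base directory name
            }
            if (id <= highWatermark && id > bestId) {
                best = dir;
                bestId = id;
            }
        }
        return Optional.ofNullable(best);
    }

    public static void main(String[] args) {
        List<String> tableDirs = new ArrayList<>();
        tableDirs.add(baseDirName(3));          // earlier overwrite
        tableDirs.add("delta_0000004_0000004"); // a later insert
        tableDirs.add(baseDirName(7));          // LOAD DATA ... OVERWRITE

        // A reader whose snapshot predates write 7 still sees the old base.
        System.out.println(visibleBase(tableDirs, 5)); // Optional[base_0000003]
        // A reader that starts after the overwrite commits sees the new base.
        System.out.println(visibleBase(tableDirs, 7)); // Optional[base_0000007]
    }
}
```

Because the old data is not deleted in place, readers with an older snapshot can keep going while the overwrite runs, which is what makes the less restrictive lock mode an option.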

This message was sent by Atlassian JIRA
