hive-issues mailing list archives

From "Sergey Shelukhin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-17856) MM tables - IOW is not ACID compliant
Date Fri, 20 Oct 2017 23:02:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-17856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16213453#comment-16213453 ]

Sergey Shelukhin commented on HIVE-17856:
-----------------------------------------

Ok I noticed this while looking at the union bug. I think the reason IOW "worked for me" in
some of the tests and works for [~steveyeom2017] when he runs the above test is that it's
still using old MM logic that is not ACID compliant.
This call in Hive.java:
{noformat}
deleteOldPathForReplace(newPartPath, oldPartPath, getConf(), isAutoPurge,
              new JavaUtils.IdPathFilter(txnId, stmtId, false, true), true,
              tbl.isStoredAsSubDirectories() ? tbl.getSkewedColNames().size() : 0);
{noformat}
deletes old delta directories by means of IdPathFilter with the 3rd arg (isMatch) being false,
which means: return ALL delta directories that don't match txnId.
So, all the other data in the table gets nuked.
This implementation is incorrect for ACID integration.
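
To make the effect of isMatch == false concrete, here is a minimal sketch. It is not
the actual JavaUtils.IdPathFilter code: the class name, the helper methods, and the
delta_<txnId>_<txnId>_<stmtId> naming with zero padding are assumptions made for
illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified stand-in for an id-based path filter. All names here are hypothetical.
public class IdPathFilterSketch {

    // Assumed naming scheme for a delta directory written by one txn/statement.
    static String deltaDirName(long txnId, int stmtId) {
        return String.format("delta_%07d_%07d_%04d", txnId, txnId, stmtId);
    }

    // isMatch == true:  accept only the directory written by (txnId, stmtId).
    // isMatch == false: accept every directory EXCEPT that one.
    static boolean accept(String dirName, long txnId, int stmtId, boolean isMatch) {
        boolean matches = dirName.equals(deltaDirName(txnId, stmtId));
        return isMatch ? matches : !matches;
    }

    // Returns the directories the filter would hand to the delete step.
    static List<String> select(List<String> dirs, long txnId, int stmtId, boolean isMatch) {
        List<String> selected = new ArrayList<>();
        for (String dir : dirs) {
            if (accept(dir, txnId, stmtId, isMatch)) {
                selected.add(dir);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<String> dirs = Arrays.asList(
                deltaDirName(1, 0), deltaDirName(2, 0), deltaDirName(3, 0));
        // The current txn is 3; with isMatch == false the deltas of txns 1 and 2
        // are both selected for deletion.
        System.out.println(select(dirs, 3, 0, false));
    }
}
```

With isMatch == false, everything written by earlier transactions is selected, which is
why the delete wipes the rest of the table without consulting any ACID state.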

So, the delete codepath for MM tables needs to be removed (easy to find by looking for where
IdPathFilter is used with isMatch == false, meaning "find every txn except this one").
Then, IOW will probably stop working w.r.t. "overwrite" because old deltas will stick around.
After that Eugene can comment on where and how ACID uses base directories to implement IOW.
I suspect that in the IOW case, it will be as simple as creating a base_.... directory for
the output instead of a delta_.... directory; and that would be enough for all the logic shared
with ACID, e.g. the compactor, to handle it correctly. The code that finds what to read in
HiveInputFormat would also need to be updated to take committed bases into account. But I am
not familiar with ACID IOW, so it may not be as simple.
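
A rough sketch of the read-side idea, under stated assumptions: directories are named
base_<txnId> and delta_<txnId>_<txnId>_<stmtId>, and a reader takes the newest base plus
only the deltas written after it. The class and method names are hypothetical. It
ignores commit state and valid-txn lists, so it is not how HiveInputFormat or the ACID
code actually resolves directories.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of "base hides older deltas" directory selection.
public class BaseAwareReadSketch {

    static List<String> dirsToRead(List<String> dirs) {
        long bestBaseTxn = -1;
        String bestBaseDir = null;

        // Find the base directory with the highest txn id, if any.
        for (String dir : dirs) {
            if (dir.startsWith("base_")) {
                long txn = Long.parseLong(dir.substring("base_".length()));
                if (txn > bestBaseTxn) {
                    bestBaseTxn = txn;
                    bestBaseDir = dir;
                }
            }
        }

        List<String> result = new ArrayList<>();
        if (bestBaseDir != null) {
            result.add(bestBaseDir);
        }

        // Keep only deltas written after the chosen base; older ones are
        // logically overwritten and can be left for cleanup.
        for (String dir : dirs) {
            if (dir.startsWith("delta_")) {
                long minTxn = Long.parseLong(dir.split("_")[1]);
                if (minTxn > bestBaseTxn) {
                    result.add(dir);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> dirs = Arrays.asList(
                "delta_0000001_0000001_0000",
                "delta_0000002_0000002_0000",
                "base_0000003",               // written by an insert overwrite
                "delta_0000004_0000004_0000");
        System.out.println(dirsToRead(dirs));
    }
}
```

In this model an IOW never deletes anything itself; writing a base is enough to make the
older deltas invisible to readers.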

cc [~hagleitn] [~steveyeom2017] [~ekoifman]



> MM tables - IOW is not ACID compliant
> -------------------------------------
>
>                 Key: HIVE-17856
>                 URL: https://issues.apache.org/jira/browse/HIVE-17856
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Transactions
>            Reporter: Sergey Shelukhin
>            Assignee: Steve Yeom
>              Labels: mm-gap-1
>
> The following tests were removed from mm_all during "integration"... I should have never allowed such manner of integration.
> MM logic should have been kept intact until ACID logic could catch up. Alas, here we are.
> {noformat}
> drop table iow0_mm;
> create table iow0_mm(key int) tblproperties("transactional"="true", "transactional_properties"="insert_only");
> insert overwrite table iow0_mm select key from intermediate;
> insert into table iow0_mm select key + 1 from intermediate;
> select * from iow0_mm order by key;
> insert overwrite table iow0_mm select key + 2 from intermediate;
> select * from iow0_mm order by key;
> drop table iow0_mm;
> drop table iow1_mm; 
> create table iow1_mm(key int) partitioned by (key2 int)  tblproperties("transactional"="true", "transactional_properties"="insert_only");
> insert overwrite table iow1_mm partition (key2)
> select key as k1, key from intermediate union all select key as k1, key from intermediate;
> insert into table iow1_mm partition (key2)
> select key + 1 as k1, key from intermediate union all select key as k1, key from intermediate;
> select * from iow1_mm order by key, key2;
> insert overwrite table iow1_mm partition (key2)
> select key + 3 as k1, key from intermediate union all select key + 4 as k1, key from intermediate;
> select * from iow1_mm order by key, key2;
> insert overwrite table iow1_mm partition (key2)
> select key + 3 as k1, key + 3 from intermediate union all select key + 2 as k1, key + 2 from intermediate;
> select * from iow1_mm order by key, key2;
> drop table iow1_mm;
> {noformat}
> {noformat}
> drop table simple_mm;
> create table simple_mm(key int) stored as orc tblproperties ("transactional"="true", "transactional_properties"="insert_only");
> insert into table simple_mm select key from intermediate;
> -insert overwrite table simple_mm select key from intermediate;
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
