hive-issues mailing list archives

From "Eugene Koifman (JIRA)" <>
Subject [jira] [Updated] (HIVE-18772) Make Acid Cleaner use MIN_HISTORY_LEVEL
Date Sat, 25 Aug 2018 01:02:00 GMT


Eugene Koifman updated HIVE-18772:
    Status: Patch Available  (was: Open)

> Make Acid Cleaner use MIN_HISTORY_LEVEL
> ---------------------------------------
>                 Key: HIVE-18772
>                 URL:
>             Project: Hive
>          Issue Type: Improvement
>          Components: Transactions
>    Affects Versions: 3.0.0
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>            Priority: Major
>         Attachments: HIVE-18772.01.patch
> Instead of using Lock Manager state as it currently does.
> This will eliminate possible race conditions
> See this [comment|]
> Suppose A is the set of all ValidTxnList across all active readers.  Each ValidTxnList
has minOpenTxnId.
> MIN_HISTORY_LEVEL allows us to determine X = min(minOpenTxnId) across all currently active readers.
> This means that no active transaction in the system sees any txn with txnid < X as open.
> This means that if we construct a ValidTxnIdList with HWM=X-1 and use it in getAcidState(), any
files that this call determines to be 'obsolete' will be seen as obsolete by every existing or future
reader, i.e. they can be physically deleted.
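The rule above can be sketched in a few lines of Java. This is only an illustration, not the code in the attached patch: the class and method names are hypothetical, the readers' minOpenTxnId values are passed in as a plain list rather than read from MIN_HISTORY_LEVEL, and the fallback used when there are no active readers is an assumption.

```java
import java.util.List;
import java.util.OptionalLong;

// Hypothetical sketch; none of these names come from Hive itself.
public class CleanerHwmSketch {

    /** X = min(minOpenTxnId) over all active readers; empty if no reader is active. */
    static OptionalLong minHistoryLevel(List<Long> minOpenTxnIds) {
        return minOpenTxnIds.stream().mapToLong(Long::longValue).min();
    }

    /**
     * High-water mark the Cleaner would use: X - 1 when readers exist.
     * Falling back to the highest allocated txn id when there are no
     * readers is an assumption made for this sketch.
     */
    static long cleanerHighWatermark(List<Long> minOpenTxnIds, long highestAllocatedTxnId) {
        OptionalLong x = minHistoryLevel(minOpenTxnIds);
        return x.isPresent() ? x.getAsLong() - 1 : highestAllocatedTxnId;
    }

    public static void main(String[] args) {
        // Three active readers whose snapshots have minOpenTxnId 13, 20 and 17.
        long hwm = cleanerHighWatermark(List.of(13L, 20L, 17L), 25L);
        System.out.println("Cleaner HWM = " + hwm); // X = 13, so HWM = 12
    }
}
```

Taking the minimum is the conservative choice: the Cleaner only acts on state that every active reader already treats as settled.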
> This is also necessary for multi-statement transactions where relying on the state of
Lock Manager is not sufficient.  For example
> Suppose txn 17 starts at t1 and sees txnid 13 with writeID 13 open.
> 13 commits (via its parent txn) at t2 > t1.  (17 is still running).
> Compaction runs at t3 > t2 to produce base_14 (or delta_10_14 for example) on Table1/Part1
(17 is still running)
> Now delta_13 may be cleaned since it can be seen as obsolete and there may be no locks
on it, i.e. no one is reading it.
> Now at t4 > t3, txn 17 (a multi-statement txn) may need to read Table1/Part1. It cannot use base_14,
since that may have absorbed delete events from delete_delta_14.
> Using MIN_HISTORY_LEVEL solves this.
> See description of HIVE-18747 for more details on MIN_HISTORY_LEVEL
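The txn 17 scenario can be replayed with a deliberately simplified model. As in the example, txn ids and write IDs are treated as the same numbers, and the visibility rule (a base is usable only if its write ID is at or below the snapshot's HWM) is an assumption standing in for what getAcidState() does. All names are hypothetical.

```java
// Hypothetical, simplified model of the scenario; not Hive code.
public class Hive18772Scenario {

    /** Assumed rule: a base is visible to a snapshot only if its write ID is <= HWM. */
    static boolean baseVisible(long baseWriteId, long hwm) {
        return baseWriteId <= hwm;
    }

    /** A delta counts as obsolete only if a visible base already covers it. */
    static boolean deltaObsolete(long deltaWriteId, long baseWriteId, long hwm) {
        return baseVisible(baseWriteId, hwm) && deltaWriteId <= baseWriteId;
    }

    public static void main(String[] args) {
        // Txn 17 is still running and saw 13 as open, so X = 13 and HWM = 12.
        long hwmWhile17Runs = 13 - 1;
        System.out.println(deltaObsolete(13, 14, hwmWhile17Runs)); // false: delta_13 is kept

        // Once 17 ends, suppose X advances to 18, giving HWM = 17.
        long hwmAfter17Ends = 18 - 1;
        System.out.println(deltaObsolete(13, 14, hwmAfter17Ends)); // true: delta_13 can be cleaned
    }
}
```

While txn 17 is running, base_14 lies above the Cleaner's HWM, so delta_13 is not considered obsolete and stays available for the read at t4.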

This message was sent by Atlassian JIRA
