hive-issues mailing list archives

From "Gopal V (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-19890) ACID: Inherit bucket-id from original ROW_ID for delete deltas
Date Fri, 22 Jun 2018 22:36:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-19890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520849#comment-16520849 ]

Gopal V commented on HIVE-19890:
--------------------------------

Thanks, I've added comments describing how createDynamicBucket() works, which is similar to
the multi-file spray.

{code}
+    /**
+     * This method is intended for use with ACID unbucketed tables, where the DELETE ops behave as
+     * though they are bucketed, but without an explicit pre-specified bucket count. The bucketNum
+     * is read out of the middle value of the ROW__ID variable and this is written out from a single
+     * FileSink, in ways similar to the multi file spray, but without knowing the total number of
+     * buckets ahead of time.
+     *
+     * ROW__ID (1,2[0],3) => bucket_00002
+     * ROW__ID (1,3[0],4) => bucket_00003 etc
+     *
+     * A new FSP is created for each partition, so this only requires the bucket numbering and that
+     * is mapped in directly as an index.
+     */
{code}
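
For readers following along, here is a minimal sketch of the idea in the comment above. It is not the patch itself. The names DynamicBucketSketch, RowId, writerFor and the StringBuilder "writers" are hypothetical stand-ins, and the sketch assumes the bucket id has already been decoded from the middle field of ROW__ID. It only shows how a single sink could index its writers directly by bucket id without knowing the bucket count up front.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not Hive's FileSinkOperator code.
final class DynamicBucketSketch {

    // Hypothetical stand-in for ROW__ID, with the bucket id already decoded.
    static final class RowId {
        final long writeId;
        final int bucketId;
        final long rowId;

        RowId(long writeId, int bucketId, long rowId) {
            this.writeId = writeId;
            this.bucketId = bucketId;
            this.rowId = rowId;
        }
    }

    // One slot per bucket id; the list grows on demand because the
    // total number of buckets is not known ahead of time.
    private final List<StringBuilder> writers = new ArrayList<>();

    // Maps a bucket id to a file name, e.g. 2 -> bucket_00002.
    static String bucketFileName(int bucketId) {
        return String.format("bucket_%05d", bucketId);
    }

    // Returns the writer for this row's bucket, creating it lazily.
    StringBuilder writerFor(RowId id) {
        while (writers.size() <= id.bucketId) {
            writers.add(null); // pad unused slots
        }
        StringBuilder w = writers.get(id.bucketId);
        if (w == null) {
            w = new StringBuilder();
            writers.set(id.bucketId, w);
        }
        return w;
    }

    // Records a delete event in the writer that matches the row's original bucket.
    void delete(RowId id) {
        writerFor(id).append(id.writeId).append(',').append(id.rowId).append('\n');
    }

    // Number of slots allocated so far (highest bucket id seen, plus one).
    int slotCount() {
        return writers.size();
    }

    // File names for the buckets that actually received delete events.
    List<String> openFiles() {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < writers.size(); i++) {
            if (writers.get(i) != null) {
                names.add(bucketFileName(i));
            }
        }
        return names;
    }

    public static void main(String[] args) {
        DynamicBucketSketch sink = new DynamicBucketSketch();
        sink.delete(new RowId(1, 2, 3));
        sink.delete(new RowId(1, 3, 4));
        System.out.println(sink.openFiles());
    }
}
```

In this sketch the two example rows from the comment land in separate writers named after their original bucket ids, and slots for bucket ids that never appear stay empty.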

> ACID: Inherit bucket-id from original ROW_ID for delete deltas
> --------------------------------------------------------------
>
>                 Key: HIVE-19890
>                 URL: https://issues.apache.org/jira/browse/HIVE-19890
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 3.0.0
>            Reporter: Gopal V
>            Assignee: Gopal V
>            Priority: Major
>         Attachments: HIVE-19890.1.patch, HIVE-19890.2.patch, HIVE-19890.3.patch
>
>
> The ACID delete deltas for unbucketed tables are written to arbitrary files; they should
> instead be shuffled using the bucket-id rather than hash(ROW__ID).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
