hive-issues mailing list archives

From "Aron Hamvas (Jira)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-22639) Bucket file name does not match bucket id after query based major compaction
Date Thu, 02 Jan 2020 13:15:00 GMT

     [ https://issues.apache.org/jira/browse/HIVE-22639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aron Hamvas updated HIVE-22639:
-------------------------------
    Status: Open  (was: Patch Available)

> Bucket file name does not match bucket id after query based major compaction
> ----------------------------------------------------------------------------
>
>                 Key: HIVE-22639
>                 URL: https://issues.apache.org/jira/browse/HIVE-22639
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 3.1.0, 3.0.0
>            Reporter: Aron Hamvas
>            Assignee: Aron Hamvas
>            Priority: Major
>         Attachments: HIVE-22639.1.patch, HIVE-22639.2.patch, HIVE-22639.patch
>
>
> While debugging {{TestCrudCompactorOnTez#testCompactionWithSchemaEvolutionAndBuckets()}},
> I noticed that before compaction the single bucket file in the delta directories is named
> {{bucket_00001}}, but in the new base the single bucket file is named {{bucket_00000}}.
> At the same time, the bucket value in the ROW__ID of the records remains the same and
> indicates that the bucket id is 1.
> So the bucket id and the file name do not match. This could lead to problems.
> The test itself does not reveal this issue, although I think the tests should check this, too.
> At the same time, the tests assume an exact bucket id value in cases where it cannot be
> predicted, and so they fail even though the bucket id does not change after the compaction
> and the check should really pass.
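
The mismatch described above can be illustrated with a minimal Java sketch. It is not the patch attached to this issue and does not use Hive classes. It assumes the ROW__ID bucket value follows the bit layout of Hive's BucketCodec V1 as I understand it (3 bits version, 1 reserved, 12 bits bucket id, 4 reserved, 12 bits statement id) and that bucket files are named {{bucket_NNNNN}}. The class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hive: compares the bucket id encoded in a
// ROW__ID bucket value with the bucket id implied by a bucket file name.
final class BucketIdCheck {
    // Assumed layout (BucketCodec V1 as understood here):
    // 3 bits version | 1 reserved | 12 bits bucket id | 4 reserved | 12 bits statement id
    private static final int BUCKET_ID_MASK = 0x0FFF0000;
    private static final int BUCKET_ID_SHIFT = 16;

    // Assumed file naming: "bucket_" followed by a zero-padded decimal id.
    private static final Pattern BUCKET_FILE = Pattern.compile("bucket_(\\d+)");

    // Extracts the bucket id bits from the encoded ROW__ID bucket value.
    static int bucketIdFromProperty(int bucketProperty) {
        return (bucketProperty & BUCKET_ID_MASK) >>> BUCKET_ID_SHIFT;
    }

    // Parses the numeric suffix of a bucket file name such as "bucket_00001".
    static int bucketIdFromFileName(String fileName) {
        Matcher m = BUCKET_FILE.matcher(fileName);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a bucket file name: " + fileName);
        }
        return Integer.parseInt(m.group(1));
    }

    // True when the file name and the encoded bucket value agree on the bucket id.
    static boolean fileNameMatchesRowId(String fileName, int bucketProperty) {
        return bucketIdFromFileName(fileName) == bucketIdFromProperty(bucketProperty);
    }

    public static void main(String[] args) {
        // Encoded value for version 1, bucket id 1, statement id 0 under the assumed layout.
        int bucketProperty = (1 << 29) | (1 << 16);

        // Before compaction: delta file bucket_00001 agrees with the records.
        System.out.println(fileNameMatchesRowId("bucket_00001", bucketProperty));
        // After compaction as reported: base file bucket_00000 disagrees with the records.
        System.out.println(fileNameMatchesRowId("bucket_00000", bucketProperty));
    }
}
```

A test could apply a check of this kind to each compacted file instead of asserting a fixed bucket id, which is the kind of verification the description asks for.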



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
