hive-issues mailing list archives

From "Hive QA (Jira)" <>
Subject [jira] [Commented] (HIVE-22639) Bucket file name does not match bucket id after query based major compaction
Date Thu, 02 Jan 2020 16:27:00 GMT


Hive QA commented on HIVE-22639:

Here are the results of testing the latest attachment:

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 17786 tests passed

Test results:
Console output:
Test logs:

Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase

This message is automatically generated.

ATTACHMENT ID: 12989819 - PreCommit-HIVE-Build

> Bucket file name does not match bucket id after query based major compaction
> ----------------------------------------------------------------------------
>                 Key: HIVE-22639
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 3.0.0, 3.1.0
>            Reporter: Aron Hamvas
>            Assignee: Aron Hamvas
>            Priority: Major
>         Attachments: HIVE-22639.1.patch, HIVE-22639.2.patch, HIVE-22639.patch
> While debugging {{TestCrudCompactorOnTez#testCompactionWithSchemaEvolutionAndBuckets()}},
> I noticed that, although the single bucket file in the delta directories is named
> {{bucket_00001}} before compaction, the single bucket file in the new base is named
> {{bucket_00000}}. At the same time, the bucket value in the ROW__ID of the records
> remains the same and suggests that the bucket id is 1.
> So the bucket id and the file name do not match. This could lead to problems.
> The test itself does not reveal this issue, although I think the tests should check
> this, too. At the same time, the tests assume an exact bucket id value in cases where it
> cannot be predicted, and they fail even though the bucket id does not change after the
> compaction, so the check should really pass.
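
The mismatch described above can be illustrated with a minimal sketch that compares the bucket id parsed from a file name with the one encoded in the ROW__ID bucket value. It assumes the bit layout of Hive's V1 bucket encoding (3 version bits, 1 reserved bit, 12 bucket id bits, 4 reserved bits, 12 statement id bits); the class and method names are hypothetical and are not taken from the patch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hive: checks that a bucket file name
// agrees with the bucket id encoded in a ROW__ID bucket value.
final class BucketIdCheck {
    // Bucket files are named like bucket_00001.
    private static final Pattern BUCKET_FILE = Pattern.compile("bucket_(\\d+)");

    // Parses the numeric suffix of the file name, e.g. "bucket_00001" -> 1.
    static int bucketIdFromFileName(String fileName) {
        Matcher m = BUCKET_FILE.matcher(fileName);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a bucket file name: " + fileName);
        }
        return Integer.parseInt(m.group(1));
    }

    // Assumed V1 layout: the bucket id occupies the 12 bits starting at bit 16.
    static int bucketIdFromProperty(int bucketProperty) {
        return (bucketProperty >>> 16) & 0xFFF;
    }

    // True when the file name and the encoded bucket value name the same bucket.
    static boolean matches(String fileName, int bucketProperty) {
        return bucketIdFromFileName(fileName) == bucketIdFromProperty(bucketProperty);
    }

    public static void main(String[] args) {
        // Encoded value for version 1, bucket 1, statement 0 under the assumed layout.
        int bucketProperty = (1 << 29) | (1 << 16);
        System.out.println(matches("bucket_00001", bucketProperty)); // file name before compaction
        System.out.println(matches("bucket_00000", bucketProperty)); // file name in the new base
    }
}
```

Under these assumptions, the first call reports a match and the second reports the mismatch that the issue describes.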

This message was sent by Atlassian Jira
