hive-issues mailing list archives

From "Hive QA (JIRA)" <>
Subject [jira] [Commented] (HIVE-17852) remove support for list bucketing "stored as directories" in 3.0
Date Mon, 25 Jun 2018 16:14:00 GMT


Hive QA commented on HIVE-17852:

Here are the results of testing the latest attachment:

{color:green}SUCCESS:{color} +1 due to 34 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 14600 tests executed
*Failed tests:*

Test results:
Console output:
Test logs:

Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed

This message is automatically generated.

ATTACHMENT ID: 12929022 - PreCommit-HIVE-Build

> remove support for list bucketing "stored as directories" in 3.0
> ----------------------------------------------------------------
>                 Key: HIVE-17852
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Laszlo Bodor
>            Priority: Major
>             Fix For: 4.0.0
>         Attachments: HIVE-17852.01.patch, HIVE-17852.02.patch, HIVE-17852.03.patch, HIVE-17852.04.patch,
>                      HIVE-17852.05.patch, HIVE-17852.06.patch, HIVE-17852.07.patch, HIVE-17852.08.patch,
>                      HIVE-17852.09.patch, HIVE-17852.10.patch, HIVE-17852.11.patch, HIVE-17852.12.patch
> From the email thread:
> 1) LB, when stored as directories, adds a lot of low-level complexity to Hive tables that
> has to be accounted for in many places in the code where the files are written or
> modified - from FSOP to ACID/replication/export.
> 2) While working on some FSOP code I noticed that some of that logic is broken - e.g.
> the duplicate file removal from tasks, a pretty fundamental correctness feature in Hive,
> may be broken. LB also doesn’t appear to be compatible with e.g. regular bucketing.
> 3) The feature hasn’t seen development activity in a while; it also doesn’t appear
> to be used a lot.
> Keeping with the theme of cleaning up “legacy” code for 3.0, I was proposing we remove it.
> (2) also suggested that, if needed, it might be easier to implement similar functionality
> by adding some flexibility to partitions (which LB directories look like anyway); that would
> also keep the logic on a higher level of abstraction (split generation, partition pruning)
> as opposed to many low-level places like FSOP, etc.

This message was sent by Atlassian JIRA
