hive-issues mailing list archives

From "Sergey Shelukhin (JIRA)" <>
Subject [jira] [Updated] (HIVE-20109) get rid of COLUMN_STATS_ACCURATE
Date Sat, 04 Aug 2018 02:53:00 GMT


Sergey Shelukhin updated HIVE-20109:
    Attachment: HIVE-20109.nogen.patch

> --------------------------------
>                 Key: HIVE-20109
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>          Components: Statistics
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>            Priority: Major
>         Attachments: HIVE-20109.nogen.patch
>
> I don't know why anyone would come up with the idea of storing a set of booleans in a
> database using JSON. This has caused various problems in the past (text field limitations,
> perf issues when parsing a giant string, and bugs because the way it is set is brittle).
> However, now that we are implementing transactional stats, it becomes especially problematic
> and error-prone, because the code in Hive sets COLUMN_STATS_ACCURATE (C_S_A) in random places
> with reckless abandon, whereas we want to change the state of the stats in well-defined places
> where txn semantics can be verified.
> Currently in HIVE-19416, we are handling the random things that touch it (from the metastore
> itself to output committers, various stats tasks, commands like truncate, etc.) via a pile
> of hacks, but the best solution would be to remove it completely and replace it with a DB
> table or columns in the stats tables that would need to be set explicitly, not via the
> generic alter_table.
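
The replacement proposed in the last paragraph can be illustrated with a minimal Java sketch. Everything in it is hypothetical: the class StatsState and its methods are not Hive APIs, and the JSON string in the comment only approximates the shape of a COLUMN_STATS_ACCURATE value. It shows the intended contrast only: accuracy flags held as typed fields and changed through a few named methods, rather than re-serialized into a table parameter by any caller of alter_table.

```java
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

/**
 * Hypothetical sketch, not Hive code. Models stats accuracy as explicit,
 * typed state instead of a JSON string kept in the table parameters, e.g.
 * something roughly like (assumed shape, for illustration only):
 *   {"BASIC_STATS":"true","COLUMN_STATS":{"id":"true"}}
 */
final class StatsState {
    // Would map to a boolean column in a stats table (assumption).
    private boolean basicStatsAccurate;
    // Would map to one row or flag per column in a stats table (assumption).
    private final Set<String> accurateColumns = new TreeSet<>();

    /** Called only where basic stats are known to have been recomputed. */
    void markBasicStatsAccurate() {
        basicStatsAccurate = true;
    }

    /** Called only where stats for this column have been recomputed. */
    void markColumnAccurate(String column) {
        accurateColumns.add(column);
    }

    /** Called by operations that change data without updating stats. */
    void invalidateAll() {
        basicStatsAccurate = false;
        accurateColumns.clear();
    }

    boolean isBasicStatsAccurate() {
        return basicStatsAccurate;
    }

    boolean isColumnAccurate(String column) {
        return accurateColumns.contains(column);
    }

    /** Read-only view, so callers cannot change state behind the API. */
    Set<String> accurateColumns() {
        return Collections.unmodifiableSet(accurateColumns);
    }
}
```

With state shaped like this, every transition goes through one of three methods, which gives a small number of places to check against transaction semantics. No string parsing is involved in reading a flag.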

This message was sent by Atlassian JIRA
