[ https://issues.apache.org/jira/browse/HIVE-20109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sergey Shelukhin updated HIVE-20109:
------------------------------------
Attachment: HIVE-20109.nogen.patch
> get rid of COLUMN_STATS_ACCURATE
> --------------------------------
>
> Key: HIVE-20109
> URL: https://issues.apache.org/jira/browse/HIVE-20109
> Project: Hive
> Issue Type: Bug
> Components: Statistics
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Priority: Major
> Attachments: HIVE-20109.nogen.patch
>
>
> I don't know why anyone would come up with the idea of storing a set of booleans in a database as JSON. This has caused various problems in the past (text field limitations, perf issues when parsing a giant string, and bugs because the way it is set is brittle).
> However, now that we are implementing transactional stats, it becomes especially problematic and error-prone because the code in Hive sets COLUMN_STATS_ACCURATE (C_S_A) in random places with reckless abandon, whereas we want to change the state of the stats in well-defined places where txn semantics can be verified.
> Currently in HIVE-19416, we are handling random things that touch it (from the metastore itself to output committers, various stats tasks, commands like truncate, etc.) via a pile of hacks, but the best solution would be to remove it completely and replace it with a DB table/columns in the stats tables that would need to be set explicitly, not via generic alter_table.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)