hive-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <>
Subject [jira] [Work logged] (HIVE-21831) Stats should be reset correctly during load of a partitioned ACID table
Date Tue, 04 Jun 2019 16:00:00 GMT


ASF GitHub Bot logged work on HIVE-21831:

                Author: ASF GitHub Bot
            Created on: 04/Jun/19 15:59
            Start Date: 04/Jun/19 15:59
    Worklog Time Spent: 10m 
      Work Description: dlavati commented on pull request #659: HIVE-21831: Stats should be reset correctly during load of a partitioned ACID table
   Change-Id: If5a56c80cb1a0f7dcf84d3d710e9290a95f93cbf
This is an automated message from the Apache Git Service.

Issue Time Tracking

            Worklog Id:     (was: 253823)
            Time Spent: 10m
    Remaining Estimate: 0h

> Stats should be reset correctly during load of a partitioned ACID table
> -----------------------------------------------------------------------
>                 Key: HIVE-21831
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive, Import/Export
>    Affects Versions: 3.0.0, 3.1.0, 3.1.1
>            Reporter: David Lavati
>            Assignee: David Lavati
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
> While running something similar to the following example, I noticed that an import of a partitioned ACID table using the ORC format fails to provide table statistics:
> {code:java}
> set hive.stats.autogather=true;
> set hive.stats.column.autogather=true;
> set hive.fetch.task.conversion=none;
> set;
> set hive.default.fileformat.managed=ORC;
> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
> create transactional table int_src (foo int, bar int);
> insert into int_src select 1,1;
> create transactional table int_exp(foo int) partitioned by (bar int);
> insert into int_exp select * from int_src;
> select count(*) from int_exp;
> create transactional table int_imp(foo int) partitioned by (bar int);
> EXPORT TABLE int_exp to '/tmp/expint';
> IMPORT TABLE int_imp FROM '/tmp/expint';
> select count(*) FROM int_imp;
> {code}
> The count returned 0 instead of the expected 1 (it stayed 0 even with on the order of 100k records), and correct statistics were only available after running compute statistics.
> This occurred only with the combination of ACID + partitioning + ORC, and it is not the expected behavior.
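
The "compute statistics" step mentioned in the description could look roughly like the sketch below. It reuses the int_imp table and the bar partition column from the reproduction. The exact statements the reporter ran are not given, so this is an assumption rather than the reporter's procedure.

{code:sql}
-- Assumed workaround: recompute basic stats for every partition of the imported table.
ANALYZE TABLE int_imp PARTITION (bar) COMPUTE STATISTICS;
-- Inspect the partition created by the example (bar=1); numRows is listed under the partition parameters.
DESCRIBE FORMATTED int_imp PARTITION (bar=1);
-- Re-run the count from the description.
select count(*) FROM int_imp;
{code}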

This message was sent by Atlassian JIRA
