hive-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <>
Subject [jira] [Work logged] (HIVE-21831) Stats should be reset correctly during load of a partitioned ACID table
Date Wed, 12 Jun 2019 08:50:00 GMT


ASF GitHub Bot logged work on HIVE-21831:

                Author: ASF GitHub Bot
            Created on: 12/Jun/19 08:49
            Start Date: 12/Jun/19 08:49
    Worklog Time Spent: 10m 
      Work Description: dlavati-hw commented on issue #659: HIVE-21831: Stats should be reset
correctly during load of a partitioned ACID table
   This can be closed, as it was merged as f4be42c5726a01479991ba6b6b6dc93f776648d5.

Issue Time Tracking

    Worklog Id:     (was: 258531)
    Time Spent: 20m  (was: 10m)

> Stats should be reset correctly during load of a partitioned ACID table
> -----------------------------------------------------------------------
>                 Key: HIVE-21831
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive, Import/Export
>    Affects Versions: 3.0.0, 3.1.0, 3.1.1
>            Reporter: David Lavati
>            Assignee: David Lavati
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>         Attachments: HIVE-21831.01.patch, HIVE-21831.02.patch, HIVE-21831.02.patch
>          Time Spent: 20m
>  Remaining Estimate: 0h
> While running something similar to the following example, I noticed that an import of a partitioned ACID table using the ORC format fails to provide table statistics:
> {code:java}
> set hive.stats.autogather=true;
> set hive.stats.column.autogather=true;
> set hive.fetch.task.conversion=none;
> set;
> set hive.default.fileformat.managed=ORC;
> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
> create transactional table int_src (foo int, bar int);
> insert into int_src select 1,1;
> create transactional table int_exp(foo int) partitioned by (bar int);
> insert into int_exp select * from int_src;
> select count(*) from int_exp;
> create transactional table int_imp(foo int) partitioned by (bar int);
> EXPORT TABLE int_exp to '/tmp/expint';
> IMPORT TABLE int_imp FROM '/tmp/expint';
> select count(*) FROM int_imp;
> {code}
> The count returned 0 instead of 1 (it stayed 0 even with on the order of 100k records), and correct statistics were only available after running compute statistics.
> This happened only with the combination of ACID + partitioning + ORC, and it is not the expected behavior.
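
As a hedged illustration of the workaround described above (not part of the original report, and not the fix merged for HIVE-21831), the statements below sketch how the imported table's statistics could be inspected and recomputed by hand. The partition value bar=1 follows from the single row inserted in the example. That count(*) is being answered from stored statistics is an assumption here, not something the report states.
{code:sql}
-- Inspect the partition-level statistics (e.g. numRows) recorded for the imported data.
DESCRIBE FORMATTED int_imp PARTITION (bar=1);
-- Recompute basic statistics for all partitions of the imported table;
-- per the report, correct statistics were only available after this step.
ANALYZE TABLE int_imp PARTITION (bar) COMPUTE STATISTICS;
-- Assumption: with refreshed statistics the count should match the exported data (1 row in this example).
select count(*) FROM int_imp;
{code}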
