hive-issues mailing list archives

From "ASF GitHub Bot (Jira)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-24224) Fix skipping header/footer for Hive on Tez on compressed files
Date Fri, 02 Oct 2020 11:32:00 GMT

     [ https://issues.apache.org/jira/browse/HIVE-24224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HIVE-24224:
----------------------------------
    Labels: pull-request-available  (was: )

> Fix skipping header/footer for Hive on Tez on compressed files
> --------------------------------------------------------------
>
>                 Key: HIVE-24224
>                 URL: https://issues.apache.org/jira/browse/HIVE-24224
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Panagiotis Garefalakis
>            Assignee: Panagiotis Garefalakis
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> A compressed file queried with Hive on Tez returns the header and footer rows, for both select * and select count(*):
> {noformat}
> printf "offset,id,other\n9,\"20200315 X00 1356\",123\n17,\"20200315 X00 1357\",123\nrst,rst,rst" > data.csv
> hdfs dfs -put -f data.csv /apps/hive/warehouse/bz2test/bz2tbl1/
> bzip2 -f data.csv 
> hdfs dfs -put -f data.csv.bz2 /apps/hive/warehouse/bz2test/bz2tbl2/
> beeline -e "CREATE EXTERNAL TABLE default.bz2tst2 (
>   sequence   int,
>   id         string,
>   other      string) 
> ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
> LOCATION '/apps/hive/warehouse/bz2test/bz2tbl2' 
> TBLPROPERTIES (
>   'skip.header.line.count'='1',
>   'skip.footer.line.count'='1');"
> beeline -e "
>   SET hive.fetch.task.conversion = none;
>   SELECT * FROM default.bz2tst2;"
> +-------------------+--------------------+----------------+
> | bz2tst2.sequence  |     bz2tst2.id     | bz2tst2.other  |
> +-------------------+--------------------+----------------+
> | offset            | id                 | other          |
> | 9                 | 20200315 X00 1356  | 123            |
> | 17                | 20200315 X00 1357  | 123            |
> | rst               | rst                | rst            |
> +-------------------+--------------------+----------------+
> {noformat}
> PS: HIVE-22769 addressed the issue for Hive on LLAP.
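
For comparison, the result the report expects can be approximated outside Hive. The following is a minimal sketch, assuming that 'skip.header.line.count'='1' and 'skip.footer.line.count'='1' mean "drop the first and last line of the file"; it is not Hive or Tez code, and `expected_rows` is a hypothetical helper introduced only for illustration. It rebuilds the same four-line CSV as the printf above, compresses it with bzip2, and shows that only the two data rows should remain.

```python
import bz2
import csv
import io

# Same four lines as the printf in the report: header, two data rows, footer.
RAW = (
    'offset,id,other\n'
    '9,"20200315 X00 1356",123\n'
    '17,"20200315 X00 1357",123\n'
    'rst,rst,rst'
)


def expected_rows(compressed, skip_header=1, skip_footer=1):
    """Hypothetical helper: decompress bzip2 bytes, parse the CSV, and drop
    the first `skip_header` and last `skip_footer` rows."""
    text = bz2.decompress(compressed).decode("utf-8")
    rows = list(csv.reader(io.StringIO(text)))
    end = len(rows) - skip_footer
    # If the skips cover the whole file, nothing is left to return.
    return rows[skip_header:end] if end > skip_header else []


compressed = bz2.compress(RAW.encode("utf-8"))
rows = expected_rows(compressed)
for row in rows:
    print(row)
```

Under that assumption, select * on the compressed table should return the two rows printed here, and select count(*) should return 2, rather than the four rows shown in the output above.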



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
