hive-issues mailing list archives

From "Lefty Leverenz (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy
Date Wed, 16 Mar 2016 08:57:33 GMT

    [ https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197056#comment-15197056 ]

Lefty Leverenz commented on HIVE-11675:
---------------------------------------

Doc note:  This adds configuration parameter *hive.orc.splits.ms.footer.cache.ppd.enabled*
to HiveConf.java, so it needs to be documented in the ORC section of Configuration Properties
for release 2.1.0.

* [Configuration Properties -- ORC File Format | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-ORCFileFormat]

This also fixes a typo in the description of *hive.orc.splits.include.fileid*, which was added
to the llap branch by HIVE-10067 and to master for release 2.0.0 by HIVE-11542.

> make use of file footer PPD API in ETL strategy or separate strategy
> --------------------------------------------------------------------
>
>                 Key: HIVE-11675
>                 URL: https://issues.apache.org/jira/browse/HIVE-11675
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>              Labels: TODOC2.1
>             Fix For: 2.1.0
>
>         Attachments: HIVE-11675.01.patch, HIVE-11675.02.patch, HIVE-11675.03.patch, HIVE-11675.04.patch,
>                      HIVE-11675.05.patch, HIVE-11675.06.patch, HIVE-11675.07.patch, HIVE-11675.08.patch, HIVE-11675.09.patch,
>                      HIVE-11675.10.patch, HIVE-11675.11.patch, HIVE-11675.12.patch, HIVE-11675.13.patch, HIVE-11675.14.patch,
>                      HIVE-11675.patch, HIVE-11675.premature.opti.patch
>
>
> Need to take a look at the best flow. It won't be much different if we do a filtering metastore
> call for each partition. So perhaps we'd need the custom sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless it can be pushed down
> to the metastore or fetched from the local cache; that way the only slow threaded op is the
> directory listings.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
