hive-issues mailing list archives

From "Gopal V (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-10114) Split strategies for ORC
Date Tue, 07 Apr 2015 19:36:12 GMT

    [ https://issues.apache.org/jira/browse/HIVE-10114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14483883#comment-14483883 ]

Gopal V commented on HIVE-10114:
--------------------------------

[~leftylev]: the hybrid mode needs explanation (the others have been left in as a "we
want the old mode back" safety).

I am still tuning Hybrid against existing workloads, so the exact way it works will change
before we ship 1.2. I can document this once we branch.

> Split strategies for ORC
> ------------------------
>
>                 Key: HIVE-10114
>                 URL: https://issues.apache.org/jira/browse/HIVE-10114
>             Project: Hive
>          Issue Type: Improvement
>    Affects Versions: 1.2.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>             Fix For: 1.2.0
>
>         Attachments: HIVE-10114.1.patch, HIVE-10114.2.patch, HIVE-10114.3.patch, HIVE-10114.4.patch, HIVE-10114.5.patch
>
>
> ORC split generation does not have clearly defined strategies for different scenarios
> (many small ORC files, few small ORC files, many large files, etc.). A few strategies,
> such as storing the file footer in the ORC split or making an entire file a single ORC
> split, already exist. This JIRA is to make split generation simpler, support different
> strategies for various use cases (BI, ETL, ACID, etc.), and lay the foundation for HIVE-7428.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
