hive-issues mailing list archives

From "Teddy Choi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-12631) LLAP: support ORC ACID tables
Date Fri, 02 Jun 2017 01:23:04 GMT

    [ https://issues.apache.org/jira/browse/HIVE-12631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16034000#comment-16034000 ]

Teddy Choi commented on HIVE-12631:
-----------------------------------

Thank you for the feedback, [~sershe].

1. I will merge OrcAcidColumnVectorProducer with OrcColumnVectorProducer.
2. DeleteEventsOverflowMemoryException is thrown when totalDeleteEventCount exceeds
maxEventsInMemory, which is configurable via ConfVars.HIVE_TRANSACTIONAL_NUM_EVENTS_IN_MEMORY,
so it will not cause an OOM. Returning a boolean value instead may be a better idea, to avoid
the confusion caused by the exception name.
3. The BitSet part is copied from VectorizedOrcAcidRowBatchReader of HIVE-14233. I will consider
replacing both of them with boolean arrays.
4. I will make canUseLlapIo return true for V2 base files and add V2 base file tests to
llap_acid.q to ensure V2 support.
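
To illustrate points 2 and 3, here is a minimal sketch. It is not code from the patch: the class
DeleteEventSketch and its method names are hypothetical, and only the totalDeleteEventCount and
maxEventsInMemory names come from the discussion above. It shows a boolean-returning limit check
in place of an exception, and the same row selection done once with a BitSet and once with a
boolean array.

```java
import java.util.Arrays;
import java.util.BitSet;

/** Hypothetical sketch; not taken from the HIVE-12631 patch. */
public class DeleteEventSketch {

    /**
     * Point 2: report whether the delete events fit under the configured
     * limit by returning a boolean instead of throwing an exception.
     */
    public static boolean fitsInMemory(long totalDeleteEventCount, long maxEventsInMemory) {
        return totalDeleteEventCount <= maxEventsInMemory;
    }

    /** Point 3, BitSet variant: a set bit marks a deleted row. */
    public static int[] selectWithBitSet(BitSet deleted, int batchSize) {
        int[] selected = new int[batchSize];
        int n = 0;
        // nextClearBit jumps directly to the next row that is not deleted.
        for (int i = deleted.nextClearBit(0); i < batchSize; i = deleted.nextClearBit(i + 1)) {
            selected[n++] = i;
        }
        return Arrays.copyOf(selected, n);
    }

    /** Point 3, boolean-array variant: true marks a deleted row. */
    public static int[] selectWithBooleanArray(boolean[] deleted) {
        int[] selected = new int[deleted.length];
        int n = 0;
        for (int i = 0; i < deleted.length; i++) {
            if (!deleted[i]) {
                selected[n++] = i;
            }
        }
        return Arrays.copyOf(selected, n);
    }

    public static void main(String[] args) {
        // Rows 1 and 3 of a five-row batch are deleted.
        BitSet bits = new BitSet(5);
        bits.set(1);
        bits.set(3);
        boolean[] flags = {false, true, false, true, false};

        System.out.println(Arrays.toString(selectWithBitSet(bits, 5)));     // [0, 2, 4]
        System.out.println(Arrays.toString(selectWithBooleanArray(flags))); // [0, 2, 4]
        System.out.println(fitsInMemory(11, 10));                           // false
    }
}
```

Both variants select the same rows. Which one is faster in the actual reader would have to be
measured; the sketch only shows that the two representations are interchangeable.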

> LLAP: support ORC ACID tables
> -----------------------------
>
>                 Key: HIVE-12631
>                 URL: https://issues.apache.org/jira/browse/HIVE-12631
>             Project: Hive
>          Issue Type: Bug
>          Components: llap, Transactions
>            Reporter: Sergey Shelukhin
>            Assignee: Teddy Choi
>         Attachments: HIVE-12631.1.patch, HIVE-12631.2.patch, HIVE-12631.3.patch, HIVE-12631.4.patch,
HIVE-12631.5.patch, HIVE-12631.6.patch, HIVE-12631.7.patch, HIVE-12631.8.patch, HIVE-12631.8.patch
>
>
> LLAP uses a completely separate read path in ORC to allow for caching and parallelization
of reads and processing. This path does not support ACID. As far as I remember ACID logic
is embedded inside ORC format; we need to refactor it to be on top of some interface, if practical;
or just port it to LLAP read path.
> Another consideration is how the logic will work with cache. The cache is currently low-level
(CB-level in ORC), so we could just use it to read bases and deltas (deltas should be cached
with higher priority) and merge as usual. We could also cache merged representation in future.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
