hive-issues mailing list archives

From Ádám Szita (Jira) <j...@apache.org>
Subject [jira] [Resolved] (HIVE-23956) Delete delta directory file information should be pushed to execution side
Date Wed, 05 Aug 2020 11:45:00 GMT

     [ https://issues.apache.org/jira/browse/HIVE-23956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ádám Szita resolved HIVE-23956.
-------------------------------
    Fix Version/s: 4.0.0
       Resolution: Fixed

> Delete delta directory file information should be pushed to execution side
> --------------------------------------------------------------------------
>
>                 Key: HIVE-23956
>                 URL: https://issues.apache.org/jira/browse/HIVE-23956
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Peter Varga
>            Assignee: Peter Varga
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>          Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> Since HIVE-23840, the LLAP cache is used to retrieve the tail of the ORC bucket files in the delete deltas. To use the cache, however, the fileId must be determined, so one more FileSystem call is issued for each bucket.
> This fileId is already available during compilation in the AcidState calculation. We should serialise it into the OrcSplit and remove the unnecessary FS calls.
> Furthermore, instead of sending the SyntheticFileId directly, we should pass the attemptId rather than the standard path hash. This way both the path and the SyntheticFileId can be calculated, and it will keep working even if move-free delete operations are introduced.
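A minimal sketch of the idea described above, not the actual patch: the per-file metadata is collected once at compile time and travels with the split, so the execution side can rebuild both the bucket file path and a synthetic file id without another FileSystem call. All class and method names here (DeleteDeltaSketch, DeleteDeltaFileMeta, SyntheticId, bucketPath, syntheticId) are hypothetical, and the "bucket_NNNNN_attemptId" naming and the (path hash, modification time, length) id layout are assumptions made for illustration.

```java
import java.util.Objects;

/** Illustrative only; these are not Hive's real classes. */
final class DeleteDeltaSketch {

    /** Hypothetical metadata gathered at compile time and serialised into the split. */
    static final class DeleteDeltaFileMeta {
        final long modTime;
        final long length;
        final int attemptId; // assumption: a negative value means "no attempt suffix"

        DeleteDeltaFileMeta(long modTime, long length, int attemptId) {
            this.modTime = modTime;
            this.length = length;
            this.attemptId = attemptId;
        }
    }

    /** Hypothetical synthetic id: path hash plus modification time and length. */
    static final class SyntheticId {
        final long pathHash;
        final long modTime;
        final long length;

        SyntheticId(long pathHash, long modTime, long length) {
            this.pathHash = pathHash;
            this.modTime = modTime;
            this.length = length;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof SyntheticId)) {
                return false;
            }
            SyntheticId other = (SyntheticId) o;
            return pathHash == other.pathHash
                && modTime == other.modTime
                && length == other.length;
        }

        @Override
        public int hashCode() {
            return Objects.hash(pathHash, modTime, length);
        }
    }

    /** Rebuilds the bucket file path from values carried in the split (assumed naming). */
    static String bucketPath(String deleteDeltaDir, int bucketId, int attemptId) {
        String name = String.format("bucket_%05d", bucketId);
        if (attemptId >= 0) {
            name = name + "_" + attemptId;
        }
        return deleteDeltaDir + "/" + name;
    }

    /** Derives the id on the execution side purely from split contents, with no FS call. */
    static SyntheticId syntheticId(String deleteDeltaDir, int bucketId, DeleteDeltaFileMeta meta) {
        String path = bucketPath(deleteDeltaDir, bucketId, meta.attemptId);
        return new SyntheticId(path.hashCode(), meta.modTime, meta.length);
    }
}
```

Because the attemptId is shipped rather than a precomputed path hash, the same inputs yield the same id on both the compilation and execution sides, and the path itself stays recoverable if the file naming depends on the attempt.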



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
