hive-issues mailing list archives

From "Eugene Koifman (Jira)" <>
Subject [jira] [Assigned] (HIVE-20327) Compactor should gracefully handle 0 length files and invalid orc files
Date Wed, 28 Jul 2021 14:11:00 GMT


Eugene Koifman reassigned HIVE-20327:

    Assignee:     (was: Eugene Koifman)

> Compactor should gracefully handle 0 length files and invalid orc files
> -----------------------------------------------------------------------
>                 Key: HIVE-20327
>                 URL:
>             Project: Hive
>          Issue Type: Improvement
>          Components: Transactions
>    Affects Versions: 2.0.0
>            Reporter: Eugene Koifman
>            Priority: Major
>         Attachments: HIVE-20327.02.patch
> Older versions of the Streaming API did not handle interrupts well and could leave behind
> 0-length ORC files, which cannot be read.
> These should simply be skipped.
> Other cases where an ORC Reader cannot be created for a file:
> 1. Regular write (1 txn delta) where the client died and didn't properly close the file.
>    This delta should be aborted and never read.
> 2. Streaming ingest write (delta_x_y, x < y). There should always be a side file if the
>    file was not closed properly (though it may still indicate that the length is 0).
> If we check these cases and still can't create a reader, the compactor should not silently
> skip the file: the system thinks it contains at least some committed data, but the file is
> corrupted (and the side file doesn't point at a valid footer). We should never be in this
> situation, and we should throw so that the end user can try manual intervention (where the
> only option may be deleting the file).
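
The cases described above can be read as a small decision procedure. The following is only a rough sketch of that reading, not the attached patch or Hive's actual compactor code. The class `CompactorFileCheck`, the `Action` enum, and every parameter name are hypothetical. Treating a side-file length that is positive and no larger than the file as "points at a valid footer" is a simplifying assumption; a real check would have to try reading the footer at that offset.

```java
import java.util.OptionalLong;

// Hypothetical sketch of the decision logic described in the issue.
// None of these names come from Hive; they exist only for illustration.
public class CompactorFileCheck {

    /** What the compactor should do with one delta file. */
    public enum Action { READ, SKIP, READ_TO_SIDE_FILE_LENGTH }

    public static Action decide(long fileLength,
                                boolean readerCreated,
                                boolean writerTxnAborted,
                                boolean streamingDelta,
                                OptionalLong sideFileLength) {
        // 0-length files left behind by older Streaming API versions: just skip.
        if (fileLength == 0) {
            return Action.SKIP;
        }
        // A reader could be created, so the file is usable as-is.
        if (readerCreated) {
            return Action.READ;
        }
        // Case 1: regular single-transaction delta whose client died.
        // The delta should have been aborted and must never be read.
        if (!streamingDelta && writerTxnAborted) {
            return Action.SKIP;
        }
        // Case 2: streaming ingest delta (delta_x_y, x < y) with a side file.
        if (streamingDelta && sideFileLength.isPresent()) {
            long committed = sideFileLength.getAsLong();
            if (committed == 0) {
                return Action.SKIP; // side file says nothing was committed
            }
            // Assumption: a length within the file stands in for "valid footer".
            if (committed > 0 && committed <= fileLength) {
                return Action.READ_TO_SIDE_FILE_LENGTH;
            }
        }
        // Committed data is expected but unreadable: fail loudly, never skip.
        throw new IllegalStateException(
            "Cannot create ORC reader for a file expected to hold committed data"
            + " (length=" + fileLength + "); manual intervention required");
    }
}
```

For example, `decide(0, false, false, false, OptionalLong.empty())` yields `SKIP`, while a non-empty, unreadable streaming delta with no usable side file throws instead of being skipped silently.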

