hive-issues mailing list archives

From "Owen O'Malley (JIRA)" <>
Subject [jira] [Updated] (HIVE-13232) Aggressively drop compression buffers in ORC OutStreams
Date Mon, 14 Mar 2016 15:45:34 GMT


Owen O'Malley updated HIVE-13232:
    Attachment: HIVE-13232.patch

At first I didn't think that I could unit test this change, but then I realized that I could
use OutStream.getBufferSize to observe it. This patch just adds the new test.

> Aggressively drop compression buffers in ORC OutStreams
> -------------------------------------------------------
>                 Key: HIVE-13232
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>          Components: ORC
>            Reporter: Owen O'Malley
>            Assignee: Owen O'Malley
>             Fix For: 0.14.1, 1.3.0, 2.1.0
>         Attachments: HIVE-13232.patch, HIVE-13232.patch, HIVE-13232.patch
> In Hive 0.11, when ORC's OutStreams were flushed, they dropped all of their buffers.
> In the patch for HIVE-4324, we inadvertently changed that behavior so that one of the buffers
> is held on to. For queries with many open writers, and thus under significant memory pressure,
> this can noticeably increase memory usage.
> Note that "hive.optimize.sort.dynamic.partition" avoids this problem by sorting on the
> dynamic partition key, so that only a single ORC writer is open at a time. This uses memory
> more effectively and avoids creating ORC files with very small stripes, which yields
> better downstream performance.
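
The buffer-dropping behavior described above, and the way a buffer-size accessor makes it
observable in a test, can be illustrated with a minimal sketch. This is not Hive's OutStream:
the class SketchOutStream, its fields, and bytesWritten are hypothetical names made up for
illustration, and the real class also handles compression and overflow buffers. Only the
idea of reporting held capacity through getBufferSize comes from the message.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an ORC-style output stream; NOT Hive's actual OutStream.
class SketchOutStream {
    private final int bufferSize;
    private final List<byte[]> sink = new ArrayList<>(); // stands in for the downstream receiver
    private ByteBuffer current;                           // allocated lazily, released on flush

    SketchOutStream(int bufferSize) {
        this.bufferSize = bufferSize;
    }

    void write(byte b) {
        if (current == null) {
            current = ByteBuffer.allocate(bufferSize);    // allocate only when data arrives
        }
        current.put(b);
        if (!current.hasRemaining()) {
            spill();                                      // buffer full: hand it downstream
        }
    }

    // Copy the buffered bytes to the sink and reset the buffer for reuse.
    private void spill() {
        current.flip();
        byte[] chunk = new byte[current.remaining()];
        current.get(chunk);
        sink.add(chunk);
        current.clear();
    }

    void flush() {
        if (current != null && current.position() > 0) {
            spill();
        }
        current = null;                                   // drop the buffer so it can be collected
    }

    // Capacity currently held by this stream; 0 once the buffer has been dropped.
    long getBufferSize() {
        return current == null ? 0 : current.capacity();
    }

    // Total bytes handed downstream so far (hypothetical helper for the sketch).
    int bytesWritten() {
        int total = 0;
        for (byte[] chunk : sink) {
            total += chunk.length;
        }
        return total;
    }
}
```

In this sketch a test would write a byte, check that getBufferSize reports the allocated
capacity, call flush, and then check that getBufferSize has returned to zero. A stream that
kept its buffer across flushes would fail the last check.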

This message was sent by Atlassian JIRA
