hive-issues mailing list archives

From "Ashutosh Chauhan (JIRA)" <>
Subject [jira] [Commented] (HIVE-16663) String Caching For Rows
Date Tue, 24 Oct 2017 23:37:00 GMT


Ashutosh Chauhan commented on HIVE-16663:

+1 pending tests

> String Caching For Rows
> -----------------------
>                 Key: HIVE-16663
>                 URL:
>             Project: Hive
>          Issue Type: Improvement
>          Components: Beeline
>    Affects Versions: 2.0.1
>            Reporter: BELUGA BEHR
>            Assignee: BELUGA BEHR
>            Priority: Minor
>         Attachments: HIVE-16663.1.patch, HIVE-16663.2.patch, HIVE-16663.3.patch,
>                      HIVE-16663.4.patch, HIVE-16663.5.patch, HIVE-16663.6.patch,
>                      HIVE-16663.7.patch
> It is very common that there are many repeated values in the result set of a query,
> especially when JOINs are present in the query. As it currently stands, beeline does
> not attempt to cache any of these values and therefore it consumes a lot of memory.
> Adding a string cache may save a lot of memory. There are organizations that use
> beeline to perform ETL processing of result sets into CSV. This will better support
> those organizations.
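
The idea in the description can be sketched roughly as follows. This is only an illustration of string caching for repeated row values, not the approach taken in the attached patches; the class name RowStringCache, its methods, and the maxEntries bound are all hypothetical and do not come from the Hive code base.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch: a bounded cache that maps each distinct column value
 * to one canonical String instance, so repeated values in buffered rows
 * share a single object instead of each holding its own copy.
 */
public class RowStringCache {
    private final Map<String, String> cache = new HashMap<>();
    private final int maxEntries; // assumed bound so the cache itself cannot grow without limit

    public RowStringCache(int maxEntries) {
        this.maxEntries = maxEntries;
    }

    /** Returns the cached instance equal to value, caching value if there is room. */
    public String canonicalize(String value) {
        if (value == null) {
            return null;
        }
        String cached = cache.get(value);
        if (cached != null) {
            return cached; // reuse the instance seen earlier
        }
        if (cache.size() < maxEntries) {
            cache.put(value, value);
        }
        return value; // first occurrence, or cache is full
    }

    /** Replaces every value in the row, in place, with its canonical instance. */
    public String[] canonicalizeRow(String[] row) {
        for (int i = 0; i < row.length; i++) {
            row[i] = canonicalize(row[i]);
        }
        return row;
    }

    /** Number of distinct values currently cached. */
    public int size() {
        return cache.size();
    }
}
```

With a cache like this, a client that buffers many rows before writing them out would call canonicalizeRow on each row as it is read. Any saving depends on how often values repeat; a result set of mostly unique values would gain little and would pay for the extra map.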

This message was sent by Atlassian JIRA
