hive-issues mailing list archives

From "Aihua Xu (JIRA)" <>
Subject [jira] [Commented] (HIVE-20153) Count and Sum UDF consume more memory in Hive 2+
Date Fri, 13 Jul 2018 20:35:00 GMT


Aihua Xu commented on HIVE-20153:

Yes. I'm able to download it. 

> Count and Sum UDF consume more memory in Hive 2+
> ------------------------------------------------
>                 Key: HIVE-20153
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>          Components: UDF
>    Affects Versions: 2.3.2
>            Reporter: Szehon Ho
>            Assignee: Aihua Xu
>            Priority: Major
>         Attachments: Screen Shot 2018-07-12 at 6.41.28 PM.png
> While playing with Hive 2, we noticed that queries with a lot of count() and sum() aggregations run out of memory on the Hadoop side, where they worked before in Hive 1.
> In many queries, we have to double the mapper memory settings (in our particular case from -Xmx2000M to -Xmx4000M), which makes it not so easy to upgrade to Hive 2.
> Taking a heap dump, we see that one of the main culprits is the field 'uniqueObjects' in GenericUDAFSum and GenericUDAFCount, which was added to support window functions.

This message was sent by Atlassian JIRA
