hive-issues mailing list archives

From "Jason Dere (JIRA)" <>
Subject [jira] [Commented] (HIVE-18252) Limit the size of the object inspector caches
Date Fri, 08 Dec 2017 21:16:01 GMT


Jason Dere commented on HIVE-18252:

I don't really want to go through the process of adding equals()/hashCode() to all of the object
inspectors: they were not designed with this in mind, and none of them implements either method.
Instead, I'll add a check that disables caching for constant object inspectors, and also limit
the number of object inspectors kept in the cache.
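
As a rough illustration of the approach described above (not the actual patch), the sketch below bounds a struct-inspector cache with LRU eviction and bypasses the cache whenever a field is a constant inspector. All type names here (Inspector, PlainInspector, ConstantInspector, StructInspector, BoundedStructCache) are hypothetical stand-ins rather than Hive classes, and the LinkedHashMap-based eviction is an assumption chosen only to keep the example dependency-free.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for Hive's inspector types; not the real Hive API.
// None of them overrides equals()/hashCode(), so they compare by identity.
interface Inspector {}

final class PlainInspector implements Inspector {}

final class ConstantInspector implements Inspector {
    final Object value;
    ConstantInspector(Object value) { this.value = value; }
}

final class StructInspector implements Inspector {
    final List<Inspector> fields;
    StructInspector(List<Inspector> fields) { this.fields = fields; }
}

final class BoundedStructCache {
    private final Map<List<Inspector>, StructInspector> cache;

    BoundedStructCache(final int maxEntries) {
        // accessOrder = true keeps entries in LRU order; removeEldestEntry
        // evicts the least recently used entry once the bound is exceeded.
        this.cache = new LinkedHashMap<List<Inspector>, StructInspector>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<List<Inspector>, StructInspector> eldest) {
                return size() > maxEntries;
            }
        };
    }

    synchronized StructInspector get(List<Inspector> fields) {
        // A key containing a constant inspector can never be matched again,
        // because every constant expression creates a new instance.
        // Only direct fields are checked; nested structs would need recursion.
        for (Inspector field : fields) {
            if (field instanceof ConstantInspector) {
                return new StructInspector(fields);
            }
        }
        List<Inspector> key = new ArrayList<>(fields); // defensive copy of the key
        StructInspector cached = cache.get(key);
        if (cached == null) {
            cached = new StructInspector(key);
            cache.put(key, cached);
        }
        return cached;
    }

    synchronized int size() { return cache.size(); }
}
```

With a bound of 2, looking up three distinct non-constant field lists leaves two entries in the map, and a lookup that includes a ConstantInspector leaves the map untouched.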

> Limit the size of the object inspector caches
> ---------------------------------------------
>                 Key: HIVE-18252
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>          Components: Types
>            Reporter: Jason Dere
>            Assignee: Jason Dere
> I was running some tests that had a lot of queries with constant values, and noticed that
> ObjectInspectorFactory.cachedStandardStructObjectInspector started using up a lot of memory.
> It appears that StructObjectInspector caching does not work properly with constant values.
> Constant ObjectInspectors are not cached, so each constant expression creates a new constant
> ObjectInspector. And since object inspectors do not override equals(), object inspector comparison
> relies on object instance comparison. So even if the values are exactly the same as what is
> already in the cache, the StructObjectInspector cache lookup fails, and Hive creates
> a new object inspector and adds it to the cache, creating another entry that will never be
> used. On top of that, there is no maximum cache size: it's just a map that is allowed to grow
> as long as values keep getting added to it.
> Some possible solutions I can think of:
> 1. Limit the size of the object inspector caches, rather than growing without bound.
> 2. Try to fix the caching to work with constant values. This would require implementing
> equals() on the constant object inspectors (which could be slow in nested cases), or else
> we would have to start caching constant object inspectors, which could be expensive in terms
> of memory usage. Could be used in combination with (1). By itself this is not a great solution
> because this still has the unbounded cache growth issue.
> 3. Disable caching in the case of constant object inspectors since this scenario currently
> doesn't work. This could be used in combination with (1).
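
The cache-miss behaviour described in the quoted issue can be reproduced with plain Java collections. This is a minimal sketch under stated assumptions: ConstantOI and IdentityKeyCacheDemo are hypothetical names, not Hive classes, and the unbounded HashMap only mimics the shape of the problem (a map keyed by lists of objects that do not override equals()).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a constant object inspector; not a Hive class.
// It deliberately does not override equals() or hashCode().
final class ConstantOI {
    final Object value;
    ConstantOI(Object value) { this.value = value; }
}

final class IdentityKeyCacheDemo {
    // Simulates n queries that all use the same constant value and returns
    // the number of entries left in an unbounded cache afterwards.
    static int runQueries(int n) {
        Map<List<ConstantOI>, Object> cache = new HashMap<>();
        for (int i = 0; i < n; i++) {
            // Each constant expression creates a fresh inspector instance.
            List<ConstantOI> key = Arrays.asList(new ConstantOI(42));
            // List.equals() delegates to the elements' equals(), which falls
            // back to identity here, so this lookup never finds a match.
            if (!cache.containsKey(key)) {
                cache.put(key, new Object());
            }
        }
        return cache.size();
    }
}
```

Every call adds a new entry even though the constant value is identical each time, so runQueries(n) returns n, which is the unbounded growth the issue describes.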

