hive-issues mailing list archives

From "Jason Dere (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-18252) Limit the size of the object inspector caches
Date Mon, 11 Dec 2017 22:27:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-18252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16286676#comment-16286676 ]

Jason Dere commented on HIVE-18252:
-----------------------------------

[~ashutoshc] can you review?

> Limit the size of the object inspector caches
> ---------------------------------------------
>
>                 Key: HIVE-18252
>                 URL: https://issues.apache.org/jira/browse/HIVE-18252
>             Project: Hive
>          Issue Type: Bug
>          Components: Types
>            Reporter: Jason Dere
>            Assignee: Jason Dere
>         Attachments: HIVE-18252.1.patch
>
>
> Was running some tests that had a lot of queries with constant values, and noticed that
> ObjectInspectorFactory.cachedStandardStructObjectInspector started using up a lot of memory.
> It appears that StructObjectInspector caching does not work properly with constant values.
> Constant ObjectInspectors are not cached, so each constant expression creates a new constant
> ObjectInspector. And since object inspectors do not override equals(), object inspector comparison
> relies on object instance comparison. So even if the values are exactly the same as what is
> already in the cache, the StructObjectInspector cache lookup would fail, and Hive would create
> a new object inspector and add it to the cache, creating another entry that would never be
> used. Plus, there is no max cache size - it's just a map that is allowed to grow as long as
> values keep getting added to it.
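
The cache-miss behavior described above can be reproduced with a small, self-contained sketch. This is not Hive code: ConstantInspector, CACHE, and getStructInspector are hypothetical stand-ins, and the only assumption carried over from the description is that the key objects do not override equals(), so list comparison falls back to instance identity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class IdentityKeyCacheDemo {
    // Hypothetical stand-in for a constant object inspector.
    // It deliberately does NOT override equals()/hashCode().
    static final class ConstantInspector {
        final Object value;
        ConstantInspector(Object value) { this.value = value; }
    }

    // Hypothetical stand-in for an unbounded struct inspector cache,
    // keyed by the list of field inspectors.
    static final Map<List<Object>, Object> CACHE = new ConcurrentHashMap<>();

    static Object getStructInspector(List<?> fieldInspectors) {
        // List.equals() delegates to the elements' equals(), which here is
        // Object.equals(), i.e. reference comparison.
        return CACHE.computeIfAbsent(new ArrayList<Object>(fieldInspectors), k -> new Object());
    }

    // Simulates n queries that each create a fresh inspector for the same constant.
    static int simulateQueries(int n) {
        CACHE.clear();
        for (int i = 0; i < n; i++) {
            getStructInspector(Arrays.asList(new ConstantInspector(42)));
        }
        return CACHE.size();
    }

    public static void main(String[] args) {
        // Every lookup misses, so the cache ends up with one entry per query.
        System.out.println(simulateQueries(1000));
    }
}
```

Because each ConstantInspector is a distinct instance, no lookup ever matches an existing key, and the map grows by one entry per call even though the constant value is always the same.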
> Some possible solutions I can think of:
> 1. Limit the size of the object inspector caches, rather than growing without bound.
> 2. Try to fix the caching to work with constant values. This would require implementing
> equals() on the constant object inspectors (which could be slow in nested cases), or else
> we would have to start caching constant object inspectors, which could be expensive in terms
> of memory usage. Could be used in combination with (1). By itself this is not a great solution
> because this still has the unbounded cache growth issue.
> 3. Disable caching in the case of constant object inspectors since this scenario currently
> doesn't work. This could be used in combination with (1).
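
For illustration, option (1) could look roughly like the sketch below. This is only one way to bound a cache, using the JDK's LinkedHashMap with least-recently-used eviction; it is not taken from HIVE-18252.1.patch, and the BoundedCache class and its maxEntries parameter are hypothetical names.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical size-limited cache: evicts the least recently used entry
// once the number of entries exceeds maxEntries.
final class BoundedCache<K, V> {
    private final Map<K, V> map;

    BoundedCache(final int maxEntries) {
        // accessOrder = true makes iteration order least-recently-accessed first,
        // so the "eldest" entry is the LRU one. synchronizedMap adds coarse
        // thread safety, since LinkedHashMap itself is not thread-safe.
        this.map = Collections.synchronizedMap(
            new LinkedHashMap<K, V>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                    return size() > maxEntries;
                }
            });
    }

    V get(K key) { return map.get(key); }

    void put(K key, V value) { map.put(key, value); }

    int size() { return map.size(); }
}
```

With a cache like this, entries that are never looked up again (as in the constant-value case) are eventually evicted instead of accumulating. It does not make the lookups hit, so it addresses the memory growth only, which is why the description suggests combining (1) with (2) or (3).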



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
