hive-issues mailing list archives

From "Vihang Karajgaonkar (JIRA)" <>
Subject [jira] [Commented] (HIVE-20192) HS2 with embedded metastore is leaking JDOPersistenceManager objects.
Date Fri, 20 Jul 2018 18:25:00 GMT


Vihang Karajgaonkar commented on HIVE-20192:

{quote} The PersistenceManagerFactory object "pmf" is a static object which keeps references
to the allocated PersistenceManager objects in the pmCache map. That's why a PersistenceManager
doesn't get GC'ed and needs an explicit shutdown on any exception. In this case we retry instead
of closing the thread, which overwrites the pm object and leaks the old one. {quote}

I see. Thanks for the explanation.
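
A minimal sketch of the leak described in the quote above, assuming a factory that remembers every manager it hands out until that manager is closed. The names {{PmLeakSketch}}, {{SketchFactory}} and {{SketchManager}} are hypothetical stand-ins, not the DataNucleus or Hive classes, and the retry loop only imitates the "overwrite the pm object on retry" behaviour.

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-ins for illustration only; these are NOT DataNucleus or Hive classes.
final class PmLeakSketch {

    /** Plays the role of the factory: it references every manager until that manager is closed. */
    static final class SketchFactory {
        // Same shape as in the heap report: a Set view backed by a ConcurrentHashMap.
        private final Set<SketchManager> pmCache =
                Collections.newSetFromMap(new ConcurrentHashMap<SketchManager, Boolean>());

        SketchManager getManager() {
            SketchManager pm = new SketchManager(this);
            pmCache.add(pm); // strong reference: pm cannot be collected while cached
            return pm;
        }

        int openManagers() {
            return pmCache.size();
        }
    }

    /** Plays the role of the manager: only close() removes it from the factory cache. */
    static final class SketchManager implements AutoCloseable {
        private final SketchFactory owner;

        SketchManager(SketchFactory owner) {
            this.owner = owner;
        }

        @Override
        public void close() {
            owner.pmCache.remove(this);
        }
    }

    /** Retry that overwrites the reference without closing: each earlier attempt stays cached. */
    static int retryWithoutClose(SketchFactory pmf, int attempts) {
        SketchManager pm = null;
        for (int i = 0; i < attempts; i++) {
            pm = pmf.getManager(); // the previous pm is dropped here but still held by pmCache
        }
        if (pm != null) {
            pm.close(); // only the last manager is ever closed
        }
        return pmf.openManagers();
    }

    /** Retry that closes the old manager before replacing it: nothing stays cached. */
    static int retryWithClose(SketchFactory pmf, int attempts) {
        SketchManager pm = null;
        for (int i = 0; i < attempts; i++) {
            if (pm != null) {
                pm.close();
            }
            pm = pmf.getManager();
        }
        if (pm != null) {
            pm.close();
        }
        return pmf.openManagers();
    }
}
```

With a fresh {{SketchFactory}}, {{retryWithoutClose(pmf, 5)}} leaves 4 managers in the cache, while {{retryWithClose(pmf, 5)}} leaves 0. If the factory is held in a static field, as "pmf" is in the report, the cached managers stay reachable from a GC root.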

{quote}I think overwriting the entry by cacheThreadLocalRawStore doesn't cause any leak,
because it overwrites with the thread local rawStore which is active in this thread. If the thread
local rawStore has changed, it means the older one was already shut down gracefully before it was
re-created. Also, threadRawStoreMap shouldn't pile up as we use the same thread id. {quote}

I think you are right. The cleanup model looks optimistic: when a thread is reused, the
{{Hive#getInternal}} method checks whether this threadlocal rawstore can be reused and
cleans it up if the owner is different or the config is not compatible. So we look good in
the thread re-use case, because the object being overwritten in
{{ThreadWithGarbageCleanup.threadRawStoreMap}} is either replaced with the same object or was
already closed before being replaced. So that code path looks good to me. This is all very
tricky business, and I hope there is no other code path that is still leaking the rawstore.
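
A rough sketch of the overwrite model discussed above, assuming a map keyed by thread id that holds the store currently active on that thread. {{ThreadStoreMapSketch}} and {{SketchStore}} are hypothetical names, not the Hive classes. The shutdown of a still-open predecessor inside {{cache}} is a defensive addition for illustration; per the comment above, the real code path relies on the earlier check in {{Hive#getInternal}} instead.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for illustration only; NOT ThreadWithGarbageCleanup or RawStore.
final class ThreadStoreMapSketch {

    /** Minimal store with an idempotent shutdown. */
    static final class SketchStore {
        private volatile boolean closed;

        void shutdown() {
            closed = true;
        }

        boolean isClosed() {
            return closed;
        }
    }

    // One entry per thread id, so re-caching on the same thread overwrites rather than piles up.
    private final Map<Long, SketchStore> storeByThread = new ConcurrentHashMap<>();

    /** Records the store currently active on a thread, always allowing the entry to be updated. */
    void cache(long threadId, SketchStore current) {
        SketchStore previous = storeByThread.put(threadId, current);
        // Assumption: defensive cleanup. The overwrite is leak-free only if the previous
        // entry is the same object or was already shut down; otherwise shut it down here.
        if (previous != null && previous != current && !previous.isClosed()) {
            previous.shutdown();
        }
    }

    /** Cleanup hook for thread exit: shuts down whatever store is registered for the thread. */
    void onThreadExit(long threadId) {
        SketchStore store = storeByThread.remove(threadId);
        if (store != null) {
            store.shutdown();
        }
    }

    int size() {
        return storeByThread.size();
    }
}
```

Because the map entry always tracks the store that is currently active on the thread, the exit hook shuts down the right object even after the store has been re-created.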

This patch looks good to me. +1

> HS2 with embedded metastore is leaking JDOPersistenceManager objects.
> ---------------------------------------------------------------------
>                 Key: HIVE-20192
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 3.0.0, 3.1.0, 4.0.0
>            Reporter: Sankar Hariappan
>            Assignee: Sankar Hariappan
>            Priority: Major
>              Labels: HiveServer2, pull-request-available
>             Fix For: 4.0.0
>         Attachments: HIVE-20192.01.patch
> Hiveserver2 instances were crashing every 3-4 days, and HS2 was observed in an unresponsive state. Full GC (FGC) was also observed happening regularly.
> From the JXray report, pmCache (a list of JDOPersistenceManager objects) is occupying 84% of the heap, and there are around 16,000 references to UDFClassLoader.
> {code:java}
> 10,759,230K (84.7%) Object tree for GC root(s) Java Static org.apache.hadoop.hive.metastore.ObjectStore.pmf
> - org.datanucleus.api.jdo.JDOPersistenceManagerFactory.pmCache ↘ 10,744,419K (84.6%), 1 reference(s)
>   - j.u.Collections$SetFromMap.m ↘ 10,744,419K (84.6%), 1 reference(s)
>     - {java.util.concurrent.ConcurrentHashMap}.keys ↘ 10,743,764K (84.5%), 16,872 reference(s)
>       - ↘ 10,738,831K (84.5%), 16,872
>         ... 3 more references together retaining 4,933K (< 0.1%)
>     - java.util.concurrent.ConcurrentHashMap self 655K (< 0.1%), 1 object(s)
>       ... 2 more references together retaining 48b (< 0.1%)
> - org.datanucleus.api.jdo.JDOPersistenceManagerFactory.nucleusContext ↘ 14,810K (0.1%), 1 reference(s)
> ... 3 more references together retaining 96b (< 0.1%){code}
> When the RawStore object is re-created, it is not allowed to be updated in ThreadWithGarbageCleanup.threadRawStoreMap, so the new RawStore never gets cleaned up when the thread exits.

This message was sent by Atlassian JIRA
