hive-issues mailing list archives

From "Thejas M Nair (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-11499) Datanucleus leaks classloaders when used using embedded metastore with HiveServer2 with UDFs
Date Fri, 04 Sep 2015 00:51:45 GMT

    [ https://issues.apache.org/jira/browse/HIVE-11499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14730114#comment-14730114 ]

Thejas M Nair commented on HIVE-11499:
--------------------------------------

[~vgumashta] Looks like the patch needs to be rebased on master.
 # I think it would be better to create an ObjectStore function that does the cleanup, so
that we limit the DN (DataNucleus) logic to that class.
 # LOG.info(e);  This can be changed to LOG.warn("Failed to clean datanucleus classloader cache");
 # Orphan Jira references will clutter the code over time. I assume the reference was added to
show which jira introduced the change, but we have git blame for that, and as the code changes
over time these references become very unreliable. IMO, Jira references are useful only when
they provide more context in a comment, e.g. "DataNucleus caches classloaders in
NucleusContext.classLoaderResolverMap. See references in profiler snapshot in HIVE-11499".
Such references are not as likely to go out of sync.
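
A minimal sketch of the first two points, under stated assumptions: CachingContext is a hypothetical stand-in for the DataNucleus context (it is not the real NucleusContext API), and StoreCleanup stands in for the single class that would own the cleanup logic. The patch under review is not reproduced here; clearing a private map via reflection is only one way such a cleanup could be written. The point is that the DN-specific knowledge lives in one method and a failure is logged at warn level with a descriptive message.

```java
import java.lang.reflect.Field;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for a third-party context that caches classloaders in a private map.
class CachingContext {
    private final Map<String, ClassLoader> classLoaderResolverMap = new ConcurrentHashMap<>();

    void resolve(String key, ClassLoader loader) {
        classLoaderResolverMap.put(key, loader);
    }

    int cachedCount() {
        return classLoaderResolverMap.size();
    }
}

// Hypothetical stand-in for the one class that is allowed to know about the cache internals.
class StoreCleanup {
    private static final Logger LOG = Logger.getLogger(StoreCleanup.class.getName());

    /** Clears the named private Map field on the given object; returns true on success. */
    static boolean clearClassLoaderCache(Object context, String fieldName) {
        try {
            Field field = context.getClass().getDeclaredField(fieldName);
            field.setAccessible(true);
            Object value = field.get(context);
            if (value instanceof Map) {
                ((Map<?, ?>) value).clear();
                return true;
            }
            LOG.warning("Failed to clean datanucleus classloader cache: field is not a Map");
            return false;
        } catch (ReflectiveOperationException | RuntimeException e) {
            // Warn with a descriptive message instead of logging the bare exception at info level.
            LOG.log(Level.WARNING, "Failed to clean datanucleus classloader cache", e);
            return false;
        }
    }
}
```

Callers only see clearClassLoaderCache; if the cached field is renamed in a later library version, the method logs a warning and returns false rather than failing the caller.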




> Datanucleus leaks classloaders when used using embedded metastore with HiveServer2 with UDFs
> --------------------------------------------------------------------------------------------
>
>                 Key: HIVE-11499
>                 URL: https://issues.apache.org/jira/browse/HIVE-11499
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2, Metastore
>    Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.1.1, 1.2.1
>            Reporter: Vaibhav Gumashta
>            Assignee: Vaibhav Gumashta
>         Attachments: HIVE-11499.1.patch, HS2-NucleusCache-Leak.tiff
>
>
> When UDFs are used, we create a new classloader to add the UDF jar. Similar to what Hadoop's
> ReflectionUtils does (HIVE-11408), DataNucleus caches the classloaders (https://github.com/datanucleus/datanucleus-core/blob/3.2/src/java/org/datanucleus/NucleusContext.java#L161).
> The JDOPersistenceManagerFactory (1 per JVM) holds on to a NucleusContext reference (https://github.com/datanucleus/datanucleus-api-jdo/blob/3.2/src/java/org/datanucleus/api/jdo/JDOPersistenceManagerFactory.java#L115).
> Until we call NucleusContext#close, the classloader cache is not cleared. In the case of UDFs
> this can lead to a PermGen leak, as shown in the attached screenshot, where NucleusContext holds
> on to several URLClassLoader objects.
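
The mechanism described above can be illustrated with a small, self-contained simulation. Everything in it is an assumption made for illustration: LongLivedContext is a hypothetical stand-in for a per-JVM context with a strongly referenced cache keyed by classloader, and UdfSessionSimulation mimics sessions that each create a fresh URLClassLoader. It does not use or reproduce the DataNucleus API.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical stand-in for a long-lived, per-JVM context (not the DataNucleus API).
class LongLivedContext implements AutoCloseable {
    // Strong references: every loader seen here stays reachable until close() is called.
    private final Map<ClassLoader, Object> resolverCache = new IdentityHashMap<>();

    Object resolverFor(ClassLoader loader) {
        return resolverCache.computeIfAbsent(loader, l -> new Object());
    }

    int cachedLoaders() {
        return resolverCache.size();
    }

    @Override
    public void close() {
        resolverCache.clear();
    }
}

class UdfSessionSimulation {
    /** Runs the given number of sessions and returns how many loaders the context still holds. */
    static int runSessions(LongLivedContext ctx, int sessions) {
        for (int i = 0; i < sessions; i++) {
            // Each session gets its own loader, as when a UDF jar is added.
            try (URLClassLoader udfLoader =
                     new URLClassLoader(new URL[0], UdfSessionSimulation.class.getClassLoader())) {
                ctx.resolverFor(udfLoader);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            // The session is over and the loader is closed, but the cache still references it.
        }
        return ctx.cachedLoaders();
    }
}
```

In this sketch the cache grows by one entry per session and only shrinks when close() is called, so the loaders and the classes they defined cannot be garbage collected while the context is alive.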



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
