uima-dev mailing list archives

From "Richard Eckart de Castilho (JIRA)" <...@uima.apache.org>
Subject [jira] [Created] (UIMA-3747) Memory management problem with compressed binary deserialization
Date Tue, 15 Apr 2014 20:50:15 GMT
Richard Eckart de Castilho created UIMA-3747:

             Summary: Memory management problem with compressed binary deserialization
                 Key: UIMA-3747
                 URL: https://issues.apache.org/jira/browse/UIMA-3747
             Project: UIMA
          Issue Type: Bug
          Components: Core Java Framework
    Affects Versions: 2.4.2SDK
            Reporter: Richard Eckart de Castilho
            Assignee: Marshall Schor

We think we have stumbled across a memory management problem with the new compressed binary
(de)serialization when a CAS is reset and reused in a loop, e.g. in the uimaFIT SimplePipeline.
When we use form 6, we consistently run into out-of-memory situations. We finally took the
time to do a heap dump analysis.

We found a huge TypeSystemImpl instance in the heap (~450MB). What makes it huge is the
typeSystemMappers field, which in our case contains 1000+ entries, each of them apparently
using a TypeSystemImpl as key.

It looks like typeSystemMappers is never reset when a CAS is reused. My current theory is
that it should be reset when CAS.reset() is called; otherwise, type systems accumulate there
when binary deserialization is used to repeatedly load data into a CAS in a loop that
resets and reuses the CAS.
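
To illustrate the suspected pattern, here is a minimal, self-contained sketch. It does not
use UIMA code: SketchTypeSystem, SketchCas, MapperCacheSketch and the clearMappersOnReset
flag are hypothetical stand-ins, and the sketch assumes that every deserialization brings
its own type system instance, which is then used as a key in a per-type-system mapper cache.

```java
import java.util.IdentityHashMap;
import java.util.Map;

/** Hypothetical stand-in for a type system; NOT the UIMA TypeSystemImpl. */
final class SketchTypeSystem {
    // Assumed cache: one mapper per source type system, keyed by object identity.
    final Map<Object, Object> mappers = new IdentityHashMap<>();

    Object mapperFor(Object sourceTypeSystem) {
        // Creates and stores a mapper the first time a source type system is seen.
        return mappers.computeIfAbsent(sourceTypeSystem, k -> new Object());
    }
}

/** Hypothetical stand-in for a reusable CAS; NOT the UIMA CAS. */
final class SketchCas {
    final SketchTypeSystem typeSystem = new SketchTypeSystem();
    private final boolean clearMappersOnReset;

    SketchCas(boolean clearMappersOnReset) {
        this.clearMappersOnReset = clearMappersOnReset;
    }

    /** Simulates loading data serialized with the given source type system. */
    void deserialize(Object sourceTypeSystem) {
        typeSystem.mapperFor(sourceTypeSystem);
    }

    /** Simulates a reset; clears the cache only if the proposed behaviour is on. */
    void reset() {
        if (clearMappersOnReset) {
            typeSystem.mappers.clear();
        }
    }
}

final class MapperCacheSketch {
    /** Resets and reloads one CAS repeatedly; returns the final cache size. */
    static int run(boolean clearMappersOnReset, int iterations) {
        SketchCas cas = new SketchCas(clearMappersOnReset);
        for (int i = 0; i < iterations; i++) {
            cas.reset();
            // Assumption: each load carries a fresh type system instance.
            cas.deserialize(new Object());
        }
        return cas.typeSystem.mappers.size();
    }

    public static void main(String[] args) {
        System.out.println("never cleared:    " + run(false, 1000));
        System.out.println("cleared on reset: " + run(true, 1000));
    }
}
```

In this sketch the cache grows by one entry per iteration when reset() leaves it alone, and
stays at a single entry when reset() clears it. Whether clearing in CAS.reset() is the right
fix for the real code is of course open.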
