uima-dev mailing list archives

From "Marshall Schor (JIRA)" <...@uima.apache.org>
Subject [jira] [Resolved] (UIMA-4820) uv3 Supporting Delta deserialization requires preserving simulated heap addresses
Date Tue, 06 Dec 2016 15:21:58 GMT

     [ https://issues.apache.org/jira/browse/UIMA-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Marshall Schor resolved UIMA-4820.
    Resolution: Fixed

> uv3 Supporting Delta deserialization requires preserving simulated heap addresses
> ---------------------------------------------------------------------------------
>                 Key: UIMA-4820
>                 URL: https://issues.apache.org/jira/browse/UIMA-4820
>             Project: UIMA
>          Issue Type: Bug
>          Components: Core Java Framework
>            Reporter: Marshall Schor
>            Assignee: Marshall Schor
>             Fix For: 3.0.0SDKexp
> UIMA supports several formats of delta deserialization. In this scheme a CAS is first
serialized (to, for example, a remote service), and a later delta serialization returns just the
changes back to the original CAS.  
> There are two approaches used to get the set of FSs to serialize.  
> * One way, used for plain binary and form4 compressed, scans the "heap" sequentially,
and sends all those FSs, including potentially FSs that are not "reachable".  
> * The other way is to use the indexes plus following reference chains to locate all "reachable"
FSs, and only send those.  This is used for XCAS, XMI, JSON, and Form6 compressed.
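
The two selection approaches above can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyFs` and `FsSelection` are hypothetical names, not UIMA classes, and the "heap" is just an ordered list of objects.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy stand-in for a feature structure (FS); not a UIMA class.
final class ToyFs {
    final String name;
    final List<ToyFs> refs = new ArrayList<>();
    ToyFs(String name) { this.name = name; }
}

final class FsSelection {
    // Approach 1 (plain binary, form4): walk the "heap" in order and take every FS,
    // whether or not anything still refers to it.
    static List<String> sequentialScan(List<ToyFs> heap) {
        List<String> out = new ArrayList<>();
        for (ToyFs fs : heap) out.add(fs.name);
        return out;
    }

    // Approach 2 (XCAS, XMI, JSON, form6): start at the indexed FSs and follow
    // reference chains, collecting only what is reachable.
    static List<String> reachable(List<ToyFs> indexed) {
        Set<ToyFs> seen = new LinkedHashSet<>();
        Deque<ToyFs> todo = new ArrayDeque<>(indexed);
        while (!todo.isEmpty()) {
            ToyFs fs = todo.pop();
            if (seen.add(fs)) {
                for (ToyFs r : fs.refs) todo.push(r);
            }
        }
        List<String> out = new ArrayList<>();
        for (ToyFs fs : seen) out.add(fs.name);
        return out;
    }

    // Builds a three-FS heap: a is indexed and references b; c is unreachable.
    static List<ToyFs> sampleHeap() {
        ToyFs a = new ToyFs("a"), b = new ToyFs("b"), c = new ToyFs("c");
        a.refs.add(b);
        return List.of(a, b, c);
    }

    public static void main(String[] args) {
        List<ToyFs> heap = sampleHeap();
        System.out.println(sequentialScan(heap));             // [a, b, c]
        System.out.println(reachable(List.of(heap.get(0))));  // [a, b]
    }
}
```

In this toy heap the unreachable FS `c` is sent by the sequential scan but skipped by the reachability walk, which is the difference the two bullets describe.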
> In V3, the plain and form4 serializations need to preserve simulated heap "addresses"
(per CAS) for the FSs sent, so that future delta deserializations see the proper
"heap" addresses. These addresses cannot be recalculated from the CAS FS contents, because intervening
GCs may have garbage collected some unreachable FSs.  
> Furthermore, a plain or form4 non-delta deserialization that is to be followed by a delta
serialization must likewise preserve these simulated heap addresses (per CAS) for all deserialized FSs.
> This preservation is needed to ensure that the simulated "addresses" of FSs stay constant,
even if unreachable FSs are reclaimed.  In practice, this means that the various maps involving
simulated heap "addresses" need to be retained and not recreated.
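
A minimal sketch of why the maps have to be retained, not the actual UIMA implementation. `SimulatedHeapAddresses` is a hypothetical class, FSs are modelled as plain string ids, and addresses are handed out as consecutive integers starting at 1 (real heap addresses would depend on FS sizes).

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical per-CAS address table; FSs are modelled as plain string ids.
final class SimulatedHeapAddresses {
    private final Map<String, Integer> addrOf = new LinkedHashMap<>();
    private int next = 1; // assumption of this sketch: address 0 is reserved

    // Assign addresses once, in serialization order, and remember them.
    void assign(List<String> fsIds) {
        for (String id : fsIds) {
            if (!addrOf.containsKey(id)) addrOf.put(id, next++);
        }
    }

    Integer lookup(String id) { return addrOf.get(id); }

    // What a from-scratch recomputation over the surviving FSs would produce.
    static Map<String, Integer> recompute(List<String> survivingIds) {
        SimulatedHeapAddresses fresh = new SimulatedHeapAddresses();
        fresh.assign(survivingIds);
        return fresh.addrOf;
    }

    public static void main(String[] args) {
        SimulatedHeapAddresses retained = new SimulatedHeapAddresses();
        retained.assign(List.of("a", "b", "c"));
        // Suppose "b" was unreachable and has since been garbage collected.
        System.out.println(retained.lookup("c"));                   // 3
        System.out.println(recompute(List.of("a", "c")).get("c"));  // 2
    }
}
```

The remote side still knows `c` under address 3, so a table rebuilt after the GC (which would give `c` address 2) could not be used to apply its delta.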
> Because they are retained, their storage needs to be released when no longer needed:
at CAS reset time, after a service's delta deserializer has finished deserializing (potentially
multiple) delta CASes, or when a new non-delta serialization is started (which re-creates
this storage).  For services use, we may add a new API to release this storage; the service
would call it after all delta deserializations for this CAS have been received. This use case
supports multiple remotes working on a common CAS and having their delta results
merged back into the original CAS.  
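
The release points listed above could be modelled roughly as follows. All names here (`RetainedAddressMaps`, `startNonDeltaSerialization`, `casReset`, `release`) are hypothetical and chosen for illustration; the issue only says that a release API may be added, not what it would look like.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical holder for the retained address maps of one CAS.
final class RetainedAddressMaps {
    private Map<String, Integer> addrOf = null; // null means nothing is retained

    boolean isRetained() { return addrOf != null; }

    // Starting a new non-delta serialization discards old storage and re-creates it.
    void startNonDeltaSerialization() {
        addrOf = new HashMap<>();
    }

    void record(String fsId, int addr) {
        if (addrOf == null) {
            throw new IllegalStateException("no address storage is currently retained");
        }
        addrOf.put(fsId, addr);
    }

    // CAS reset drops the storage.
    void casReset() { release(); }

    // Explicit release, to be called once the last expected delta has been
    // deserialized and merged into this CAS.
    void release() { addrOf = null; }

    public static void main(String[] args) {
        RetainedAddressMaps maps = new RetainedAddressMaps();
        maps.startNonDeltaSerialization();
        maps.record("a", 1);
        System.out.println(maps.isRetained()); // true
        maps.release();
        System.out.println(maps.isRetained()); // false
    }
}
```

Keeping the explicit `release()` separate from `casReset()` reflects the multi-remote case: the storage has to survive several delta deserializations, so only the caller knows when the last one has arrived.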
