uima-dev mailing list archives

From Richard Eckart de Castilho <...@apache.org>
Subject Re: opinion on degree of backwards compatibility for Uima V3 experiment
Date Fri, 02 Sep 2016 17:56:26 GMT
See comment at end of mail.

On 02.09.2016, at 15:18, Marshall Schor <msa@schor.com> wrote:
> 
> To go from an ID to an FS is not generally possible, because normally, the
> framework doesn't keep this association.  There are exceptions though, the main
> ones being:
> 
> a) If you use low level CAS Apis to create FSs, the API returns the ID, which
> means, that a GC that happens right after the API returns would garbage collect
> the FS because at that point, nothing is "holding on" to any reference (it's not
> in any index).  To prevent this, the low level create FS methods add the FS to a
> map which goes from ID -> FS, and thus "holds onto" the FS, preventing Garbage
> collection.
> 
> b) Another case where this happens is when PEARs are used; in this case the FSs
> involved with PEAR "trampoline" FSs end up being in similar maps.
> 
> Both of these approaches of course disable a feature of V3 - namely, that
> unreferenced FSs can be garbage collected.
> 
> ...
> 

> There is an API in the V3 CASImpl, getFsFromId(int)  and also
> getFsFromId_checked(int), which retrieves the associated FS, given the ID, or
> returns null (or throws an exception) if it isn't in the table.  Most FSs
> created normally, won't be in the table.

Could we do the following? As soon as an FS has been added to an index or is referenced
from another FS, its ID should be resolvable to the respective FS.

When an FS is in an index or is referenced by another FS, it cannot be garbage collected
anyway. The CAS could maintain a lookup using weak references to provide a central place
to look up such FSes via their IDs without preventing garbage collection.
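
To make the idea concrete, here is a minimal sketch of such a lookup in plain Java. It is
only an illustration under assumptions: the class WeakIdLookup and its methods register,
lookup and size are hypothetical names and not part of the UIMA API, and the type
parameter T stands in for whatever FS class the CAS would store. The table holds weak
references only, so an entry never keeps an otherwise unreachable object alive.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical ID -> object lookup that does not keep its values alive. */
final class WeakIdLookup<T> {

    /** Weak reference that remembers the ID it was registered under. */
    private static final class IdRef<T> extends WeakReference<T> {
        final int id;

        IdRef(int id, T referent, ReferenceQueue<T> queue) {
            super(referent, queue);
            this.id = id;
        }
    }

    private final Map<Integer, IdRef<T>> table = new HashMap<>();
    private final ReferenceQueue<T> collected = new ReferenceQueue<>();

    /** Call when an object becomes reachable, e.g. when it is added to an index. */
    public synchronized void register(int id, T fs) {
        purge();
        table.put(id, new IdRef<>(id, fs, collected));
    }

    /** Returns the object for the ID, or null if it is unknown or already collected. */
    public synchronized T lookup(int id) {
        purge();
        IdRef<T> ref = table.get(id);
        return ref == null ? null : ref.get();
    }

    /** Number of entries currently held, after dropping cleared ones. */
    public synchronized int size() {
        purge();
        return table.size();
    }

    /** Drops table entries whose referents the garbage collector has cleared. */
    private void purge() {
        Reference<? extends T> ref;
        while ((ref = collected.poll()) != null) {
            IdRef<?> idRef = (IdRef<?>) ref;
            // Remove only if the ID still maps to this exact reference object,
            // so a newer registration under the same ID is left untouched.
            table.remove(idRef.id, idRef);
        }
    }
}
```

A caller would register each FS once it is indexed or referenced and later resolve it with
lookup(id), which costs one hash lookup instead of a scan over the whole CAS. A null
result means the ID was never registered or the object has since been collected.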

WebAnno remembers the ID of every FS rendered on screen. When the user performs an action,
we load the CAS from disk and then look up the ID to retrieve the FS. We do not keep the CAS
in memory all the time. If we had to scan the whole CAS for the FS with a given ID, it
would probably have a serious performance impact.

Cheers,

-- Richard