uima-dev mailing list archives

From Joern Kottmann <kottm...@gmail.com>
Subject Re: opinion on degree of backwards compatibility for Uima V3 experiment
Date Wed, 07 Sep 2016 11:46:09 GMT
Hello all,

At my workplace we use UIMA mostly with custom code to load data into a
pipeline and store its results, so we don't depend on the UIMA
serialization formats at all. Changing them, or adding new ones that are
incompatible, wouldn't be an issue for us. Our existing code can also be
ported to work with UIMA 3.

I really hope we can get UIMA 3 into a shape where it is easier to use with
today's requirements (e.g. with Hadoop) and possibilities.

I personally think that the effort to create the next overhauled version
shouldn't be limited in any way by backward compatibility.
For me it is a good solution if there is some help with migrating things to
UIMA 3 (e.g. a guide which explains what to do), and maybe if UIMA 2 is
maintained in parallel for a while (e.g. fixes for very urgent/critical
bugs).

Jörn

On Fri, Sep 2, 2016 at 7:56 PM, Richard Eckart de Castilho <rec@apache.org>
wrote:

> See comment at end of mail.
>
> On 02.09.2016, at 15:18, Marshall Schor <msa@schor.com> wrote:
> >
> > To go from an ID to an FS is not generally possible, because normally
> > the framework doesn't keep this association. There are exceptions
> > though, the main ones being:
> >
> > a) If you use low level CAS APIs to create FSs, the API returns the ID,
> > which means that a GC that happens right after the API returns would
> > garbage collect the FS, because at that point nothing is "holding on"
> > to any reference (it's not in any index). To prevent this, the low
> > level create FS methods add the FS to a map which goes from ID -> FS,
> > and thus "holds onto" the FS, preventing garbage collection.
> >
> > b) Another case where this happens is when PEARs are used; in this case
> > the FSs involved with PEAR "trampoline" FSs end up being in similar
> > maps.
> >
> > Both of these approaches of course disable a feature of V3 - namely, that
> > unreferenced FSs can be garbage collected.
> >
> > ...
> >
>
> > There is an API in the V3 CASImpl, getFsFromId(int) and also
> > getFsFromId_checked(int), which retrieves the associated FS, given the
> > ID, or returns null (or throws an exception) if it isn't in the table.
> > Most FSs created normally won't be in the table.
>
> Can we do this? -> As soon as an FS has been added to an index or is being
> referenced from another FS, its ID should be resolvable to the respective
> FS.
>
> When an FS is in an index or is referred to by another FS, it cannot be
> garbage collected anyway. The CAS could maintain a lookup using weak
> references to provide a central place to look up such FSes via their IDs
> without preventing garbage collection.
>
> WebAnno remembers the ID of every FS rendered on screen. When the user
> performs an action, we load the CAS from disk and then look up the ID to
> retrieve the FS. We do not keep the CAS in memory all the time. If we had
> to scan the whole CAS for the FS with a given ID, it would probably have
> a serious performance impact.
>
> Cheers,
>
> -- Richard
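
For illustration, the weak-reference lookup Richard proposes above could be sketched roughly as follows. This is a minimal sketch under assumptions, not UIMA code: the class WeakIdLookup and its methods register, get and getChecked are hypothetical names, and the sketch assumes the framework would call register whenever an FS is added to an index or becomes referenced from another FS. The get/getChecked pair only mirrors the null-versus-exception behaviour described for getFsFromId(int) and getFsFromId_checked(int).

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical ID -> FS table that does not keep the FSs alive. */
final class WeakIdLookup<T> {
    // Values are weak references, so an entry alone never prevents
    // the garbage collector from reclaiming an otherwise unreachable FS.
    private final Map<Integer, WeakReference<T>> table = new ConcurrentHashMap<>();

    /** Assumed to be called when an FS is indexed or becomes referenced. */
    void register(int id, T fs) {
        table.put(id, new WeakReference<>(fs));
    }

    /** Returns the FS for the ID, or null if unknown or already collected. */
    T get(int id) {
        WeakReference<T> ref = table.get(id);
        if (ref == null) {
            return null;
        }
        T fs = ref.get();
        if (fs == null) {
            // The FS was collected; drop the stale entry (only if unchanged).
            table.remove(id, ref);
        }
        return fs;
    }

    /** Like get, but throws instead of returning null. */
    T getChecked(int id) {
        T fs = get(id);
        if (fs == null) {
            throw new IllegalArgumentException("No FS for id " + id);
        }
        return fs;
    }
}
```

In contrast to the strong ID -> FS map described for the low level create methods, a table like this resolves an ID only while something else (an index or another FS) still holds the FS, which is the case Richard describes for WebAnno.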
