uima-dev mailing list archives

From Marshall Schor <...@schor.com>
Subject Re: Design choices for changing type systems with loaded JCas classes [was Re: UIMAv3 & WebAnno]
Date Sun, 07 Jan 2018 05:06:02 GMT
Let's not give up yet :-)

I'm thinking of an approach now which would cover some of the cases, perhaps
including WebAnno's.

It would work like this
  (follows your earlier idea that "extra" features in JCas be pre-setup
   to work if and when a type system defining those features is used
   with this JCas definition):

1) Some type system (perhaps not having all the features for a type specified)
would be created.
2) A CAS would be instantiated from this, causing the associated JCas classes to
be loaded.

Let's assume those JCas classes include class T with features f1, f2, f3, and f4.
If the loaded type system only had feature f1 and f2, the code right now reports
that f3 and f4 aren't in the type system, but continues.  It also initializes
the feature offsets for those in the JCas class to -1, so (accidental) refs to
these throw exceptions.

Now, instead of that, let's suppose the feature offsets get initialized to the
next sequential slot(s).  Accessing these with this type system at run time
would still throw errors, as it should, because the slot arrays are only
allocated for the number of slots defined in the type.
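
To make the slot idea concrete, here is a toy model of it in plain Java. It is only a sketch and not UIMA's actual implementation: the class SequentialOffsets and the methods bindOffsets, newInstance and set are hypothetical names invented for illustration. The offsets are bound once, as if at JCas class load time, and each instance's slot array is sized by whichever type system is in use.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SequentialOffsets {
    /** Hypothetical stand-in for JCas class loading: every declared feature
     *  gets the next sequential slot, whether or not the current type system has it. */
    public static Map<String, Integer> bindOffsets(List<String> jcasFeatures) {
        Map<String, Integer> offsets = new LinkedHashMap<>();
        for (String f : jcasFeatures) {
            offsets.put(f, offsets.size());
        }
        return offsets;
    }

    /** Hypothetical stand-in for creating a feature structure: the slot array
     *  is only as long as the number of features the type system defines. */
    public static Object[] newInstance(List<String> typeSystemFeatures) {
        return new Object[typeSystemFeatures.size()];
    }

    /** Writes through the pre-bound offset; throws ArrayIndexOutOfBoundsException
     *  when the slot was never allocated for this type system. */
    public static void set(Object[] slots, Map<String, Integer> offsets,
                           String feature, Object value) {
        slots[offsets.get(feature)] = value;
    }

    public static void main(String[] args) {
        Map<String, Integer> offsets = bindOffsets(List.of("f1", "f2", "f3", "f4"));

        // Type system with only f1 and f2: f3 is bound but has no slot.
        Object[] small = newInstance(List.of("f1", "f2"));
        set(small, offsets, "f2", "ok");
        try {
            set(small, offsets, "f3", "boom");
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("f3 is not available in the small type system");
        }

        // Later type system with f1..f4: the same bound offsets fit.
        Object[] full = newInstance(List.of("f1", "f2", "f3", "f4"));
        set(full, offsets, "f3", "fits");
        System.out.println("f3 in the full type system: " + full[offsets.get("f3")]);
    }
}
```

The point of the sketch is that the offsets never change after the first bind; only the length of the slot array differs between type systems.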

However, if a different type system defining f1, f2, f3 and f4 for type T is
used later, the already-loaded JCas class would fit it just right.

This only works when the different type definitions for type T keep the slots in
the same order, with no omissions.
(The JCas initialization for a new type system checks this.)  For example, valid
definitions for type T would be
  T with features (none)
  T with features f1
  T with features f1, f2
  T with features f1, f2, f3
  T with features f1, f2, f3, f4
  T with features f1, f2, f3, f4, f5 ...

You could not have T with features f2, f3  (skipping f1).  And the feature
definitions would need to be in this exact order (it would not work for T with
features f2, f1).
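
The ordering rule can be sketched as a simple check. This is a toy illustration, not the check the JCas initialization actually performs; PrefixCheck and fits are hypothetical names. It assumes that the "f1, f2, f3, f4, f5 ..." case above means a type system may also define extra trailing features beyond those in the JCas class, so the two feature lists only have to agree on their common leading part.

```java
import java.util.List;

public class PrefixCheck {
    /** Returns true when the two feature lists agree, in order and without gaps,
     *  on as many leading features as the shorter list has. */
    public static boolean fits(List<String> jcasFeatures, List<String> typeSystemFeatures) {
        int n = Math.min(jcasFeatures.size(), typeSystemFeatures.size());
        return jcasFeatures.subList(0, n).equals(typeSystemFeatures.subList(0, n));
    }

    public static void main(String[] args) {
        List<String> jcas = List.of("f1", "f2", "f3", "f4");
        System.out.println(fits(jcas, List.of()));                             // no features
        System.out.println(fits(jcas, List.of("f1", "f2")));                   // leading prefix
        System.out.println(fits(jcas, List.of("f1", "f2", "f3", "f4", "f5"))); // extra trailing feature
        System.out.println(fits(jcas, List.of("f2", "f3")));                   // skips f1
        System.out.println(fits(jcas, List.of("f2", "f1")));                   // wrong order
    }
}
```

The first three calls describe definitions that would be accepted under this reading; the last two correspond to the skipped-feature and reordered cases that would not work.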

Does this cover the case(s) in WebAnno?

Re: giving up on using JCas with WebAnno.  The JCas was always a somewhat more
"static" / "compile-time" description of types, than the non-JCas APIs, which
had generic methods which had arguments like Type(s) and Feature(s).  For
applications where the app really had no idea about the types, the JCas probably
is not a good fit. 

There's a hybrid approach - using the JCas for the "top of the type hierarchy" -
so for example, if you had subtypes of Annotation (not known at compile time for
the app), you could assign instances of those to Annotation, and then use the
getBegin() etc. APIs, while also using the non-JCas APIs to access the rest of
the features (as needed), based on what types are actually being used.  In this
case the app has no knowledge (at compile time) of the types (beyond the fact
that some of them are subtypes of Annotation).

On 1/6/2018 9:18 AM, Richard Eckart de Castilho wrote:
> On 06.01.2018, at 00:10, Marshall Schor <msa@schor.com> wrote:
>> Here's the specifics:  if the maximal-type system for type T has features f1,
>> f2, f3, f4, f5, and the JCas class defines all these features, then the load of
>> that JCas class will bind those.
>> A subsequent switch to a type system with f1, f2, will work.
>> But a subsequent switch to a type system with f1, f3 won't work (because the
>> offset for f3 is set to, say "3", and the length of the feature slots allocated
>> is only 2).
>> To work around these, the application needs to use class loader isolation to
>> force reloading of the JCas classes.
> Hm, I am wondering if this problem is actually already present in v2 and for
> some reason I just never hit it.
> I am not sure if/how I could sensibly use classloader isolation in WebAnno.
> JCases are passed around all the time and operations on them happen in many
> locations in the code. And not only that - FeatureStructures extracted from
> the JCas are also passed around a lot (although always limited to a single
> web request so that the FSes do not become invalid).
> It would be very tricky to determine when to reload JCas classes. Reloading
> the JCas classes every time an operation of the JCas is executed probably
> would introduce a serious overhead - and it would (probably) break the
> FeatureStructures that have already been extracted from a CAS and are
> being passed around.
> Maybe I could try isolating those places where legacy data is loaded...
> I suppose the easiest and safest would be to give up on using JCas
> entirely in WebAnno and use only the CAS API - which might also actually
> be slower than JCas in UIMAv3. I might end up manually writing wrapper
> classes for certain annotation types that internally use the CAS API.
> Best,
> -- Richard
