uima-dev mailing list archives

From Richard Eckart de Castilho <...@apache.org>
Subject Re: Design choices for changing type systems with loaded JCas classes [was Re: UIMAv3 & WebAnno]
Date Fri, 05 Jan 2018 19:05:08 GMT
On 05.01.2018, at 17:16, Marshall Schor <msa@schor.com> wrote:
> Based on Web Annot's use case, I'm thinking thorough alternatives.

"WebAnno" ;)

> One way to support this would be to have the user code tell the UIMA framework
> that no reachable instances of JCas classes exist; the user would be responsible
> for guaranteeing this.

The user code may have no way to know whether this is the case, let alone
to enforce it.

> The other choice would be to not support this (because of the inherent dangers)
> and instead require users having multiple type systems with JCas classes
> specifying features only in some versions of those type systems, first load the
> JCas classes with the feature-maximal versions of the types.
> I think I favor the 2nd approach, as it is much safer. 
> What do others think we should do?

The current line of thinking seems to assume that:

1) a type system definition is loaded (maybe from an XML file)
2) a CAS is created using the TSD
3) the JCas classes are loaded and are initialized according to the TSD

The suggestion to "first load a feature-maximal version of the types" seems
to be following that line. I.e. the TSD loaded in 1) should cover all
the features also covered by the JCas classes.

How about a slightly different approach:

1) a type system definition is loaded (maybe from an XML file)
1a) the JCas classes are loaded and their definitions are merged with the TSD
2) a CAS is created using the merged TSD
3) the JCas classes are initialized with the now feature-maximal type system

An error would/should be thrown if in step 1a the JCas classes
and the TSD are inherently incompatible. 
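
To make step 1a a bit more concrete, here is a minimal sketch of the merge for a
single type. It does not use any UIMA API: the class name, the method name, the
exception, and the representation of features as a map from feature name to
range type name are all assumptions made for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not UIMA code: merges the features a TSD declares for
// one type with the features declared by the corresponding JCas class.
final class TypeSystemMergeSketch {

    /** Hypothetical error for step 1a: the two sources disagree on a feature. */
    static final class IncompatibleTypeException extends RuntimeException {
        IncompatibleTypeException(String message) {
            super(message);
        }
    }

    /**
     * Returns the union of both feature sets (feature name -> range type name).
     * Throws if a feature appears in both sources with different range types.
     */
    static Map<String, String> mergeFeatures(String typeName,
                                             Map<String, String> fromTsd,
                                             Map<String, String> fromJCasClass) {
        // Start with the TSD features, keeping their declaration order.
        Map<String, String> merged = new LinkedHashMap<>(fromTsd);
        for (Map.Entry<String, String> e : fromJCasClass.entrySet()) {
            // putIfAbsent returns the existing range type, or null if it was added.
            String existing = merged.putIfAbsent(e.getKey(), e.getValue());
            if (existing != null && !existing.equals(e.getValue())) {
                throw new IncompatibleTypeException("Type " + typeName + ", feature "
                        + e.getKey() + ": TSD declares range " + existing
                        + " but the JCas class declares " + e.getValue());
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> tsd = new LinkedHashMap<>();
        tsd.put("pos", "uima.cas.String");

        Map<String, String> jcas = new LinkedHashMap<>();
        jcas.put("pos", "uima.cas.String");
        jcas.put("lemma", "uima.cas.String"); // only the JCas class knows this one

        // The merged result is feature-maximal: it contains both pos and lemma.
        System.out.println(mergeFeatures("example.Token", tsd, jcas));
    }
}
```

The CAS in step 2 would then be created from the merged result, so the JCas
classes in step 3 never see a type that lacks one of their features.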

In this case, the JCas classes would be an additional source of type system
information. Thinking this further, one could even initialize a CAS without
providing any TSD, simply by having UIMA inspect the available JCas classes
(e.g. through classpath scanning or by providing the framework with a list
of classes). To complete this, the JCas classes could be enhanced with
Java annotations to carry any information included in TSDs which is currently
not included in a machine-readable way in the JCas classes, e.g. type and
feature description text. As such, a set of suitably annotated JCas classes
could be converted to a TSD XML and vice versa.
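
As a rough sketch of what such annotated classes could look like: the
annotations `@TypeInfo` and `@FeatureInfo`, the `Token` class, and the
converter method below are all hypothetical and do not exist in UIMA. The
generated XML only imitates the element names used in type system descriptors,
and XML escaping is left out for brevity.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, not UIMA code: carries TSD information on a class via
// Java annotations and reads it back through reflection.
final class AnnotatedJCasSketch {

    /** Hypothetical annotation holding type-level TSD information. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface TypeInfo {
        String name();
        String supertype() default "uima.tcas.Annotation";
        String description() default "";
    }

    /** Hypothetical annotation holding feature-level TSD information. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface FeatureInfo {
        String name();
        String range();
        String description() default "";
    }

    /** Stand-in for a JCas class; a real one would extend a UIMA base class. */
    @TypeInfo(name = "example.Token", description = "A single token.")
    static class Token {
        private String pos;

        @FeatureInfo(name = "pos", range = "uima.cas.String",
                     description = "Part-of-speech tag.")
        public String getPos() { return pos; }

        public void setPos(String pos) { this.pos = pos; }
    }

    /** Builds a typeDescription fragment from the annotations (no XML escaping). */
    static String toTypeDescriptionXml(Class<?> jcasClass) {
        TypeInfo type = jcasClass.getAnnotation(TypeInfo.class);
        if (type == null) {
            throw new IllegalArgumentException(jcasClass + " has no @TypeInfo");
        }
        // Reflection does not guarantee method order, so sort by feature name.
        Map<String, FeatureInfo> features = new TreeMap<>();
        for (Method m : jcasClass.getDeclaredMethods()) {
            FeatureInfo f = m.getAnnotation(FeatureInfo.class);
            if (f != null) {
                features.put(f.name(), f);
            }
        }
        StringBuilder xml = new StringBuilder("<typeDescription>\n");
        xml.append("  <name>").append(type.name()).append("</name>\n");
        xml.append("  <description>").append(type.description()).append("</description>\n");
        xml.append("  <supertypeName>").append(type.supertype()).append("</supertypeName>\n");
        xml.append("  <features>\n");
        for (FeatureInfo f : features.values()) {
            xml.append("    <featureDescription>\n");
            xml.append("      <name>").append(f.name()).append("</name>\n");
            xml.append("      <description>").append(f.description()).append("</description>\n");
            xml.append("      <rangeTypeName>").append(f.range()).append("</rangeTypeName>\n");
            xml.append("    </featureDescription>\n");
        }
        xml.append("  </features>\n</typeDescription>\n");
        return xml.toString();
    }

    public static void main(String[] args) {
        System.out.println(toTypeDescriptionXml(Token.class));
    }
}
```

The reverse direction (generating annotated classes from a TSD) would be a
matter for the code generator and is not sketched here.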

The above assumes that JCas classes are loaded and initialized eagerly, but
it could probably be adapted to a situation where the classes are loaded lazily.


-- Richard
