uima-dev mailing list archives

From "Marshall Schor (JIRA)" <...@uima.apache.org>
Subject [jira] [Commented] (UIMA-5554) Strange exception when trying to get JCas FS class through reflection
Date Mon, 11 Sep 2017 14:30:01 GMT

    [ https://issues.apache.org/jira/browse/UIMA-5554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16161351#comment-16161351
] 

Marshall Schor commented on UIMA-5554:
--------------------------------------

Thanks for the great questions, Richard :-)

I think most of these questions revolve around JCas and non-JCas usage in UIMA.  
UIMA originally had only non-JCas usage (JCas was added later).  The non-JCas model of access
to the CAS is useful today when you're writing "generic" annotators that need to work with
multiple type systems, where you don't know the type(s) and feature(s) ahead of time.

It's somewhat like Java's "reflection" approach - you do some calls in the typeSystemInit
method to get some special values into local fields, which you then use when accessing UIMA
types and features.  
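
To make the analogy concrete, here is a minimal plain-Java sketch of that pattern: look up a handle by name once, keep it in a field, and use it for every later access. It uses only java.lang.reflect, not the UIMA API; the class names ReflectionStyleAccess and Token and the methods init and read are hypothetical and exist only for this illustration.

```java
import java.lang.reflect.Method;

class ReflectionStyleAccess {
    // Hypothetical stand-in for a type the generic code does not know at compile time.
    public static class Token {
        private final String text;
        public Token(String text) { this.text = text; }
        public String getText() { return text; }
    }

    // Handle resolved once up front, comparable to a value saved in typeSystemInit.
    private Method getter;

    // One-time lookup by name, done before any processing starts.
    void init(Class<?> type, String getterName) {
        try {
            getter = type.getMethod(getterName);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("No such accessor: " + getterName, e);
        }
    }

    // Per-item access through the cached handle; no name lookup happens here.
    Object read(Object instance) {
        try {
            return getter.invoke(instance);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ReflectionStyleAccess reader = new ReflectionStyleAccess();
        reader.init(Token.class, "getText");
        System.out.println(reader.read(new Token("hello")));
    }
}
```

The point of the split is that the name-based lookup is paid once, while the per-item path only uses the saved handle.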

The JCas, in contrast, adds classes having class names equivalent to the UIMA types, and method
names corresponding to the features, and you use these in your code.  But, of course, using
these means your code is specific to those types and features.

The bridge that exists between these is somewhat flexible.  Even when JCas is being used,
the non-JCas access capabilities continue to exist alongside it.  So, in a particular pipeline
/ CAS, there may be UIMA types for which no JCas classes exist, or UIMA types that have additional
"features" for which no JCas getters/setters exist; these can be accessed (if needed) using
the non-JCas approach.

A use case which motivates this scenario is the type "merging" that UIMA does when given type
system descriptors coming from annotator descriptors - that merging might "add" additional
features to a type (say, needed by a particular annotator you're including in the pipeline),
or even add additional Types.   That Annotator might be a non-JCas annotator, and doesn't
define any JCas classes.  So there might not be any getter/setter for these additional fields
or types.
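
As a rough illustration of why merging can leave features without getters/setters, here is a simplified plain-Java sketch that models each descriptor as a map from type name to feature names and takes the union. This is an assumption-laden model, not UIMA's merge implementation (which also has to deal with range types, supertypes, and conflicts); TypeMergeSketch and merge are hypothetical names.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class TypeMergeSketch {
    // Union of all types, and per type the union of all feature names.
    static Map<String, Set<String>> merge(List<Map<String, Set<String>>> descriptors) {
        Map<String, Set<String>> merged = new TreeMap<>();
        for (Map<String, Set<String>> descriptor : descriptors) {
            for (Map.Entry<String, Set<String>> entry : descriptor.entrySet()) {
                merged.computeIfAbsent(entry.getKey(), k -> new TreeSet<>())
                      .addAll(entry.getValue());
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Descriptor of an annotator that ships a JCas class for Foo.
        Map<String, Set<String>> jcasAnnotator = Map.of("Foo", Set.of("begin", "end"));
        // Descriptor of a non-JCas annotator that adds a feature and a type.
        Map<String, Set<String>> genericAnnotator =
            Map.of("Foo", Set.of("confidence"), "Bar", Set.of("label"));
        System.out.println(merge(List.of(jcasAnnotator, genericAnnotator)));
    }
}
```

In this model the merged Foo carries "confidence" even though the first annotator's class for Foo knows nothing about it, which is the situation described above.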

That is the state of things in both V2 and V3.

The implementation details in V3 differ, because the actual Feature Structure instances are
represented as instances of some JCas class.  In the case where the style is non-JCas, there
still are the "built-in" JCas classes, like TOP and Annotation. When you define a UIMA type,
say Foo, it always inherits from some supertype (TOP, if none other).  If no JCas definition
exists (in v3) for an instance, its most specific supertype JCas class is used to instantiate
it.
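
The fallback rule can be sketched in plain Java as a walk up the supertype chain until a type with a class is found. This is only a model of the rule as stated here, not the V3 code; the type hierarchy is simplified, and SupertypeFallbackSketch and classTypeFor are hypothetical names.

```java
import java.util.Map;
import java.util.Set;

class SupertypeFallbackSketch {
    // Returns the nearest type (the type itself or an ancestor) that has a class.
    // parentOf maps a type name to its supertype name; the root has no entry.
    static String classTypeFor(String typeName,
                               Map<String, String> parentOf,
                               Set<String> typesWithClass) {
        for (String t = typeName; t != null; t = parentOf.get(t)) {
            if (typesWithClass.contains(t)) {
                return t;
            }
        }
        throw new IllegalArgumentException(
            "No class found for " + typeName + " or its supertypes");
    }

    public static void main(String[] args) {
        // Simplified hierarchy: Foo -> Annotation -> TOP.
        Map<String, String> parentOf = Map.of("Foo", "Annotation", "Annotation", "TOP");
        // Only the built-in types have classes; Foo has none.
        Set<String> typesWithClass = Set.of("TOP", "Annotation");
        System.out.println(classTypeFor("Foo", parentOf, typesWithClass));
    }
}
```

With these inputs, an instance of Foo would be represented by the class for Annotation.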

With that background, let's address the questions, maybe in reverse order....

1) Why are the types not simply "installed" when the JCas class is loaded and initialized?
 (Installed means the corresponding UIMA types are installed).  

JCas classes are normally loaded and initialized as part of type system commit, once the type
system itself has been committed.
The exception of course, is that any user code running before type system commit, might make
a reference to a JCas type; the first such reference would cause Java to load and initialize
the JCas class.  Also, a user might write code like Class.forName(...) to force loading of
a class.  V3 reports errors if these other kinds of loading/initializing are done before type
system commit.

The reason that UIMA types are not "installed" when a JCas type is loaded in the two exception
cases above is that the details of the UIMA types are not available at that time (the type
system hasn't yet been gathered from all the annotators in the pipeline, merged, and
committed). The UIMA types could be supersets of what this particular JCas implementation
defines (see, for example, the use case above where some non-JCas Annotator used additional
features).  
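
For reference, plain Java distinguishes loading a class from initializing it: the three-argument Class.forName(name, false, loader) returns the Class object without running static initializers, while the one-argument form initializes the class. The sketch below shows only that standard Java behavior; whether it is sufficient for obtaining JCas classes in V3 is not something this comment establishes. LoadWithoutInitSketch, Guarded, loadOnly and loadAndInit are hypothetical names.

```java
class LoadWithoutInitSketch {
    // Flipped by Guarded's static initializer so the effect can be observed.
    static boolean initialized = false;

    // Hypothetical stand-in for a class whose static initializer must not run too early.
    static class Guarded {
        static {
            LoadWithoutInitSketch.initialized = true;
        }
    }

    // Binary name of the nested class, built so it works in any package.
    static String guardedName() {
        return LoadWithoutInitSketch.class.getName() + "$Guarded";
    }

    // Loads the class but does not run its static initializer.
    static Class<?> loadOnly(String name) {
        try {
            return Class.forName(name, false, LoadWithoutInitSketch.class.getClassLoader());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Loads the class and runs its static initializer if it has not run yet.
    static Class<?> loadAndInit(String name) {
        try {
            return Class.forName(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling loadOnly(guardedName()) leaves the initialized flag false; a later loadAndInit(guardedName()) sets it to true.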

2) Is there a new concept of installing/committing types in V3?  

In both v2 and v3, type systems need to be assembled from annotator descriptors in a pipeline,
merged, and committed, before being used.  UIMA uses this concept to make access efficient.
This is, admittedly, a trade-off versus a more dynamic approach (one that looks
up more information on each access), but this trade-off was made early in the design of UIMA.

In V3, an additional "ordering" requirement is present, requiring that the UIMA type system
be assembled, merged, and committed, before any JCas classes are *initialized*.  This, again,
is an efficiency trade-off, and enables feature access to be compiled into very efficient code
that works well with the cache-loading behavior of modern CPU designs.  New error messages were
added to detect when this constraint is being violated.

3) In v2 it was possible to have any number of type systems and different CASes initialized
with different type systems - and if different classloaders were used, even with different
JCas classes.  

This is also the case in V3, as long as the merged/committed type system is available before
any JCas classes are installed.  If you are using JCas with different class loaders, each
can have a different associated type system.  This was done to support, for example, running
"servlets" which each have their own isolating class loaders, each servlet running perhaps
a different UIMA pipeline.

4) It is possible to re-initialize a CAS with a new type system even after one has already
been committed.  

This is also true in V3.  Users of this typically are using the non-JCas APIs, because the
reinitialization could install any type system.  The V3 implementation ensures that the built-in
types and their JCas implementations are always available, and have the same feature offsets.
 So, it is expected that most use cases doing this kind of thing will continue to work.

5) The type system for a CAS is committed, per CAS/classloader.  

This should still be true.  

I hope this explains things a bit better.  

> Strange exception when trying to get JCas FS class through reflection
> ---------------------------------------------------------------------
>
>                 Key: UIMA-5554
>                 URL: https://issues.apache.org/jira/browse/UIMA-5554
>             Project: UIMA
>          Issue Type: Bug
>          Components: Core Java Framework
>    Affects Versions: 3.0.0SDK-beta
>            Reporter: Richard Eckart de Castilho
>
> I am trying to get a class object for a JCas FS type using reflection:
> {noformat}
> Class.forName(typeName);
> {noformat}
> However, it produces this strange error.
> {noformat}
> java.lang.ExceptionInInitializerError
> 	at java.lang.Class.forName0(Native Method)
> 	at java.lang.Class.forName(Class.java:264)
> ...
> Caused by: org.apache.uima.cas.CASRuntimeException: A JCas class field "sofa" is being
> initialized by non-framework (user) code before Type System Commit for a type system with
> a corresponding type. Either change the user load code to not do initialize, or to defer it
> until after the type system commit.
> 	at org.apache.uima.cas.impl.TypeSystemImpl.getAdjustedFeatureOffset(TypeSystemImpl.java:2575)
> 	at org.apache.uima.jcas.cas.AnnotationBase.<clinit>(AnnotationBase.java:71)
> 	... 27 more
> {noformat}
> Is it considered harmful to try getting a class object for a JCas FS class?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
