uima-dev mailing list archives

From Marshall Schor <...@schor.com>
Subject Re: Design choices for changing type systems with loaded JCas classes [was Re: UIMAv3 & WebAnno]
Date Tue, 09 Jan 2018 21:53:46 GMT
I did an initial implementation, ignoring Pear files.

I think the "feature expansion" when loading PEAR-classpath specified JCas
classes can't reasonably be done (because by the time you lazily get around to
loading these, the type system is committed).

So, I plan to have the pear loading path operate like before, with no feature
expansion.
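
A minimal sketch of why the lazy PEAR path cannot expand features, under
stated assumptions: the class below is a hypothetical stand-in, not the UIMA
API, and every name in it (CommittedTypeSystemSketch, loadPearJCasClass, the
type and feature names) is invented for illustration. It only models the
ordering problem: once the type system is committed, a lazily loaded JCas
class can no longer add features, so any extra ones are left unexpanded.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of a type system that locks at commit time (not UIMA code).
final class CommittedTypeSystemSketch {
    private final Map<String, Set<String>> features = new HashMap<>();
    private boolean committed;

    // Adding a feature is only legal before commit.
    void addFeature(String type, String feature) {
        if (committed) {
            throw new IllegalStateException("type system is committed");
        }
        features.computeIfAbsent(type, k -> new LinkedHashSet<>()).add(feature);
    }

    void commit() {
        committed = true;
    }

    Set<String> featuresOf(String type) {
        return Collections.unmodifiableSet(
                features.getOrDefault(type, Collections.emptySet()));
    }

    // Lazy PEAR path: called when a PEAR is first entered, i.e. after commit.
    // Returns the JCas-declared features that are NOT in the committed type
    // and leaves the type system unchanged (no feature expansion).
    Set<String> loadPearJCasClass(String type, Set<String> declaredByJCas) {
        Set<String> notExpanded = new LinkedHashSet<>(declaredByJCas);
        notExpanded.removeAll(features.getOrDefault(type, Collections.emptySet()));
        return notExpanded;
    }
}
```

What the framework does with such unexpanded features is not stated in this
thread; the sketch merely reports them.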

I kind of doubt this will be a real issue in actual practice (he said hopefully
:-) ).

Still need to fix up some test cases, but it's looking promising...

-Marshall


On 1/8/2018 2:47 PM, Marshall Schor wrote:
> In working out the details, the following difficulty emerges:
>
> In the general case, a pipeline is associated with a class loader (used to load
> JCas classes).
> When the pipeline contains "PEARs", each pear can specify its own class loader,
> and therefore, its own set of JCas classes.
>
> So, at type system commit time, with this proposal, it would be necessary to
> find all of the class loaders that Pears might be using.  This unfortunately is
> not possible in general, because the Pears are associated with a particular
> pipeline, and you can load a type system and create a CAS without referring to a
> particular pipeline. 
>
> In the current implementation, the presence of a Pear in the pipeline is
> discovered (if and) when the pear is entered for the first time, and at that
> time (lazily) the loading of that Pear's JCas classes happens.
>
> Various limitations are possible, I suppose (e.g., not allowing a Pear version
> of a JCas class to have new features).
>
> Still thinking about this...
>
> -Marshall
>
>
> On 1/8/2018 10:16 AM, Marshall Schor wrote:
>> After a lot of thought, here's a proposal, along the lines Richard suggests:
>>
>> The basic idea is to have the JCas classes, if they exist for some type, augment
>> that type with features defined only in the JCas class.
>>
>> This augmentation would be done at type system commit time, and would really
>> modify the type system being committed to have the extra features.  Because the
>> type system would be modified to include these extra features, the Feature
>> Structures made with these "augmented" types would be larger (because they would
>> have slots for these features).  This ensures that subtypes' features won't
>> overlap / collide with the expanded features.
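
A minimal sketch of the slot argument above, assuming a simplified layout in
which each feature gets the next free slot index, inherited features first.
This is an illustration only, not UIMA's actual storage layout; CommitSketch,
Type, and slotsAtCommit are hypothetical names. The point it shows: if the
JCas-only features are added to a type at commit time, the subtype's own
features are numbered after them, so the two cannot collide.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of slot assignment at commit time (not UIMA code).
final class CommitSketch {

    static final class Type {
        final String name;
        final Type parent;                       // null for a root type
        final List<String> ownFeatures = new ArrayList<>();

        Type(String name, Type parent, String... features) {
            this.name = name;
            this.parent = parent;
            Collections.addAll(ownFeatures, features);
        }
    }

    // Assigns slot indices: the parent's slots first, then this type's
    // declared features, then any features found only in its JCas class.
    static Map<String, Integer> slotsAtCommit(
            Type t, Map<String, List<String>> jcasOnlyFeatures) {
        Map<String, Integer> slots = (t.parent == null)
                ? new LinkedHashMap<>()
                : slotsAtCommit(t.parent, jcasOnlyFeatures);
        List<String> all = new ArrayList<>(t.ownFeatures);
        all.addAll(jcasOnlyFeatures.getOrDefault(t.name, Collections.emptyList()));
        for (String feature : all) {
            // The next free index is the current number of assigned slots.
            slots.putIfAbsent(feature, slots.size());
        }
        return slots;
    }
}
```

For a type Token with feature "pos", a JCas-only feature "lemma", and a
subtype with its own feature "extra", this numbering yields pos=0, lemma=1,
extra=2; without the augmentation "extra" would take slot 1.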
>>
>> I'll work out the details, and see if I can make this change.
>>
>> -Marshall
>>
>>
>> On 1/5/2018 2:05 PM, Richard Eckart de Castilho wrote:
>>> On 05.01.2018, at 17:16, Marshall Schor <msa@schor.com> wrote:
>>>> Based on Web Annot's use case, I'm thinking through alternatives.
>>> "WebAnno" ;)
>>>
>>>> One way to support this would be to have the user code tell the UIMA framework
>>>> that no reachable instances of JCas classes exist; the user would be responsible
>>>> for guaranteeing this.
>>> There may be no way for the user code to know if this is the case or not or to
>>> enforce this to be the case.
>>>
>>>> The other choice would be to not support this (because of the inherent dangers)
>>>> and instead require users who have multiple type systems, with JCas classes
>>>> specifying features present only in some versions of those type systems, to
>>>> first load the JCas classes with the feature-maximal versions of the types.
>>>>
>>>> I think I favor the 2nd approach, as it is much safer. 
>>>>
>>>> What do others think we should do?
>>> The current line of thinking seems to assume that:
>>>
>>> 1) a type system definition is loaded (maybe from an XML file)
>>> 2) a CAS is created using the TSD
>>> 3) the JCas classes are loaded and are initialized according to the TSD
>>>
>>> The suggestion to "first load a feature-maximal version of the types" seems
>>> to be following that line. I.e. the TSD loaded in 1) should cover all
>>> the features also covered by the JCas classes.
>>>
>>> How about a slightly different approach:
>>>
>>> 1) a type system definition is loaded (maybe from an XML file)
>>> 1a) the JCas classes are loaded and their definitions are merged with the
>>>     TSD
>>> 2) a CAS is created using the merged TSD
>>> 3) the JCas classes are initialized with the now feature-maximal type system
>>>
>>> An error would/should be thrown if in step 1a the JCas classes
>>> and the TSD are inherently incompatible. 
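
A minimal sketch of step 1a above, assuming a type system definition can be
modelled as a map from type name to (feature name -> range type name). This
is a stand-in for illustration, not the UIMA API; TsdMergeSketch and merge
are hypothetical names, and type inheritance is ignored. Features declared
only by the JCas classes are added to the TSD, and a feature declared on both
sides with different range types is treated as "inherently incompatible".

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model: type name -> (feature name -> range type name). Not UIMA code.
final class TsdMergeSketch {

    // Merges JCas-declared features into a copy of the TSD; throws if the
    // same feature is declared on both sides with different range types.
    static Map<String, Map<String, String>> merge(
            Map<String, Map<String, String>> tsd,
            Map<String, Map<String, String>> jcasDeclared) {
        Map<String, Map<String, String>> merged = new LinkedHashMap<>();
        // Copy the TSD first so neither input is modified.
        tsd.forEach((type, feats) -> merged.put(type, new LinkedHashMap<>(feats)));

        for (Map.Entry<String, Map<String, String>> typeEntry : jcasDeclared.entrySet()) {
            String type = typeEntry.getKey();
            Map<String, String> feats =
                    merged.computeIfAbsent(type, k -> new LinkedHashMap<>());
            for (Map.Entry<String, String> f : typeEntry.getValue().entrySet()) {
                // putIfAbsent returns the existing range type, or null if newly added.
                String existing = feats.putIfAbsent(f.getKey(), f.getValue());
                if (existing != null && !existing.equals(f.getValue())) {
                    throw new IllegalStateException(
                            "Incompatible range for " + type + ":" + f.getKey()
                            + " (TSD: " + existing + ", JCas: " + f.getValue() + ")");
                }
            }
        }
        return merged;
    }
}
```

The merged result would then be what the CAS is created from in step 2, so
the JCas classes initialized in step 3 see a feature-maximal type system.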
>>>
>>> In this case, the JCas classes would be an additional source of type system
>>> information. Thinking this further, one could even initialize a CAS without
>>> providing any TSD, simply by having UIMA inspect the available JCas classes
>>> (e.g. through classpath scanning or by providing the framework with a list
>>> of classes). To complete this, the JCas classes could be enhanced with
>>> Java annotations to carry any information included in TSDs which is currently
>>> not included in a machine-readable way in the JCas classes, e.g. type and
>>> feature description text. As such, a set of suitably annotated JCas classes
>>> could be converted to a TSD XML and vice versa.
>>>
>>> The above assumes that JCas classes are loaded and initialized eagerly, but 
>>> probably it could be adapted to a situation where the classes are loaded lazily.
>>>
>>> Cheers,
>>>
>>> -- Richard
>>>
>>>
>

