uima-dev mailing list archives

From Marshall Schor <...@schor.com>
Subject Re: Problem with creating type systems with extra features merged from JCas class definitions - needs some creative thinking
Date Fri, 19 Jan 2018 15:05:41 GMT
Taking your last point a bit further.

The reason for doing all this in the first place was to make the JCas class
setup for feature offsets work when a subsequent type system defines
additional features that are already in the JCas class.

The "trick" used was to merge in to the type system any features defined in the
JCas class, but not in the type system, during type system commit.

A better thing to do, maybe, would be to put these extra features into the type
system, not as real features, but only as entries used when setting up the
"offsets", so that the offsets get set properly.

Advantages: things like serialization / deserialization which depend on the
exact type system spec continue to work as before; serialization would serialize
as if the extra features were not present in the type system, so interfacing
with other systems would continue to work unchanged.
Attempting to access a feature not in the type system could be made to give an
error.
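
As a rough illustration of the idea above (not UIMA code), the sketch below
keeps two kinds of entries per type: features declared in the type system,
and entries known only from the JCas class that exist purely to reserve an
offset. All names here (OffsetOnlyFeaturesSketch, TypeLayout, Slot,
offsetOf, accessOffset, serializedFeatures) are hypothetical and chosen only
for illustration; how the real type system commit would record this is
still open.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only: none of these names are UIMA API.
public class OffsetOnlyFeaturesSketch {

    /** One slot in a type's feature layout. */
    static final class Slot {
        final int offset;
        final boolean declared; // false: known only from the JCas class

        Slot(int offset, boolean declared) {
            this.offset = offset;
            this.declared = declared;
        }
    }

    static final class TypeLayout {
        private final Map<String, Slot> slots = new LinkedHashMap<>();

        TypeLayout(List<String> typeSystemFeatures, List<String> jcasClassFeatures) {
            // Declared features get their offsets first, as real features.
            for (String name : typeSystemFeatures) {
                addIfMissing(name, true);
            }
            // Extra JCas-only features get an offset but are not real features.
            for (String name : jcasClassFeatures) {
                addIfMissing(name, false);
            }
        }

        private void addIfMissing(String name, boolean declared) {
            if (!slots.containsKey(name)) {
                slots.put(name, new Slot(slots.size(), declared));
            }
        }

        /** Used when setting up JCas class offsets: sees every slot. */
        int offsetOf(String name) {
            Slot slot = slots.get(name);
            if (slot == null) {
                throw new IllegalArgumentException("Unknown feature: " + name);
            }
            return slot.offset;
        }

        /** Used for get/set access: rejects slots that are not real features. */
        int accessOffset(String name) {
            int offset = offsetOf(name);
            if (!slots.get(name).declared) {
                throw new IllegalStateException(
                    "Feature not in the type system: " + name);
            }
            return offset;
        }

        /** What a serializer would see: only the declared features. */
        List<String> serializedFeatures() {
            List<String> result = new ArrayList<>();
            for (Map.Entry<String, Slot> e : slots.entrySet()) {
                if (e.getValue().declared) {
                    result.add(e.getKey());
                }
            }
            return result;
        }
    }

    public static void main(String[] args) {
        TypeLayout layout = new TypeLayout(
            Arrays.asList("begin", "end"),
            Arrays.asList("begin", "end", "confidence"));
        System.out.println(layout.offsetOf("confidence"));
        System.out.println(layout.serializedFeatures());
    }
}
```

In this sketch the serialized view is unchanged by the extra JCas-only
entry, while accessOffset("confidence") throws because that feature is not
part of the type system.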

This seems the most compatible way to introduce this capability - so I'll see if
I can figure out how to do something like this.

-Marshall


On 1/19/2018 4:58 AM, Richard Eckart de Castilho wrote:
> On 19.01.2018, at 03:37, Marshall Schor <msa@schor.com> wrote:
>> The trouble with this is that it breaks several serializations (where the type
>> system is required to be "known", for example Cas Complete serialization),
>> because the layout of the serialized format is with respect to the type system
>> which created the serialization.
>>
>> I'm not sure what a reasonable solution to this issue is; thoughts welcomed :-)
> Several thoughts:
>
> - The CasCompleteSerializer (as opposed to the CASSerializer) includes a CASMgrSerializer.
>   We use the CASMgrSerializer in CasIOUtils to optionally reinitialize the CAS with a type
>   system different from what it had at creation time via setupCasFromCasMgrSerializer().
>   Doesn't that contain sufficient information about the type system to decode it?
>
> - Allow disabling the augmentation of the CAS from JCas classes, e.g. for people
>   that need full control over the type system in the CAS and do not want to use
>   JCas with that CAS (and also cannot easily use classloader isolation).
>   The idea was brought up earlier already when we observed that UIMA v2 had an
>   option to disable JCas.
>
> - It may be possible to mark features which were automatically obtained from 
>   the JCas and to not take these into account when deserializing with CasComplete.
>   It would cause an inconsistency though: when loading the data into a CAS and
>   storing the data again (both with CasComplete), the type systems in the input
>   and output would differ.
>
> Cheers,
>
> -- Richard
>
>

