uima-dev mailing list archives

From Marshall Schor <...@schor.com>
Subject Re: Design choices for changing type systems with loaded JCas classes [was Re: UIMAv3 & WebAnno]
Date Thu, 11 Jan 2018 15:24:15 GMT

Now I understand what you meant by JCas first :-).

I see how it could solve the problem for a single type, in isolation (without
consideration for super/subtypes).

However, if the type had subtypes, then things break down.  This is because of
the following constraints.
Assume a type T with two subtypes, TS1 and TS2.

The nature of inheritance requires that TS1 and TS2 both contain all the
features of T.
Because instances of TS1 and TS2 could be cast to T, the JCas class for T could
be used to retrieve the features common to TS1 and TS2.
This constrains the "offsets" of those features to be the same in TS1 and TS2.
This, in turn, implies that the slots for T's features come before the slots for
the features that TS1 and TS2 define themselves (TS1 and TS2 could each define a
different number of those).

So building such a type system would work nicely, so far.
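
As a rough illustration of that layout (a toy model in Python, not UIMA code;
the feature names t1, t2, a, b, c and the helper assign_offsets are made up for
the example), assume offsets are simply handed out in order, inherited features
first:

```python
def assign_offsets(own_features, inherited=None):
    """Return {feature: offset}: inherited slots first, own features appended."""
    offsets = dict(inherited or {})
    for name in own_features:
        # Slots are contiguous from 0, so the next free offset is the current size.
        offsets[name] = len(offsets)
    return offsets


# Hypothetical type T with two features, and two subtypes with their own features.
T = assign_offsets(["t1", "t2"])
TS1 = assign_offsets(["a"], inherited=T)
TS2 = assign_offsets(["b", "c"], inherited=T)

# Every feature of T sits at the same offset in T, TS1 and TS2,
# so code written against T can read it from any of the three.
shared_ok = all(TS1[f] == T[f] == TS2[f] for f in T)

print(T, TS1, TS2, shared_ok)
```

In this model TS1 and TS2 both start their own features at offset 2, right
after T's features, which is the situation the next paragraph starts from.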

Now consider a new type system with type T' (named the same as T, but having an
extra feature, not in the JCas).
TS1 and TS2 now have an extra feature being inherited, which must occupy the
same slot.
But the offsets for TS1's and TS2's own features were already assigned, directly
after the features of the old definition of T.

So now things no longer work: the slot that the new inherited feature needs is
already taken by the subtypes' own features.
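
A small sketch of the conflict (again a toy model in Python, not UIMA code; the
offset values and the names append_slot and occupant are assumptions made for
the illustration). It takes the offsets as already fixed by the loaded JCas
classes and asks where a feature added to T' could go:

```python
# Offsets as already fixed by the loaded JCas classes (hypothetical values).
JCAS_T = {"t1": 0, "t2": 1}
JCAS_TS1 = {"t1": 0, "t2": 1, "a": 2}
JCAS_TS2 = {"t1": 0, "t2": 1, "b": 2, "c": 3}


def append_slot(layout):
    """Offset a new feature would get if appended after the existing ones."""
    return max(layout.values()) + 1


def occupant(layout, offset):
    """Name of the feature already stored at this offset, or None if free."""
    for name, off in layout.items():
        if off == offset:
            return name
    return None


# "JCas first" applied to T' alone: the new feature goes right after t1, t2.
new_in_T = append_slot(JCAS_T)

# The subtypes inherit the new feature at that same offset, but it is taken.
collisions = {
    "TS1": occupant(JCAS_TS1, new_in_T),
    "TS2": occupant(JCAS_TS2, new_in_T),
}

# Appending per type instead yields a different offset in each type,
# which breaks the rule that an inherited feature has one common offset.
appended = {
    "T": append_slot(JCAS_T),
    "TS1": append_slot(JCAS_TS1),
    "TS2": append_slot(JCAS_TS2),
}

print(new_in_T, collisions, appended)
```

Either way the constraint is violated in this model: reusing T's next slot
collides with a and b, and appending per type gives three different offsets for
what has to be a single inherited feature.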



On 1/10/2018 3:15 PM, Richard Eckart de Castilho wrote:
>> Some use cases with comments:
>> 1) Type T loaded with features f1, f2, f3,  JCas loaded with f1, f2, f3
>> Followed by: Type T loaded with features f1, f3.
>> This causes at the 2nd Type T commit time, the augmentation of type T with
>> feature f2.
>> But, the (current) impl just does an "addFeature" API call.  The result is that
>> without extra work, the features in the type system will be ordered as f1, f3,
>> f2.  And the assigned offsets could be different. 
>> To fix this, the algorithm which assigns offsets will need to see if the
>> corresponding JCas class (if any) has offsets already assigned, and try to use
>> those.
> This is why I suggested to use "JCas first": the order of the features should be
> defined by the JCas (i.e. they come first) while features defined in other TSDs
> get appended after that.
>> 2) Type T having supertype TS; Type T has 1 feature, f1, JCas for Type T has 1
>> feature f1.  TS has no features, no JCas for TS or JCas for TS has no features. 
>> Followed by: Type TS is loaded, having one feature (not in the JCas if there is
>> one for TS).
>> This causes the features for type T (which includes all the features of its
>> supertype), to have offsets shifted down.
>> For example if T has feature f1 with offset "3",  it would now have offset "4"
>> (accounting for the space taken by the TS feature).
> I believe this could also be resolved by using "JCas first": first all the slots
> for features defined in any of the JCas classes in the inheritance hierarchy
> are assigned and afterwards the features defined in other TSDs are appended.
> I believe that by using "JCas first", the slots for the JCas class features
> are always fixed, independent of what other TSDs they are combined with.
> Does "JCas first" now sound more sensible?
> ... or maybe I am misunderstanding something basic (which is entirely possible).
> Cheers,
> -- Richard
