uima-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Richard Eckart de Castilho <...@apache.org>
Subject Re: Generics in 2.8.0 getAllIndexedFS
Date Tue, 07 Jul 2015 20:19:59 GMT
On 07.07.2015, at 21:37, Marshall Schor <msa@schor.com> wrote:

> Burn reports a source incompatibility with the 2.8.0 signature for the
> getAllIndexedFS.  This is a somewhat complex issue.  Here's what I've found so far:
> 
> 1) We have 3 sets of versions of getAllIndexedFS - one in the plain CAS index
> repo, one in the JCas version of the index repo, and one directly in the JCas
> itself.  The signatures are not consistent.
> 
> 2) https://issues.apache.org/jira/browse/UIMA-4299 strongly suggests not
> returning values of <? extends XXX>.  We opted in the plain CAS version to use
>   <T extends FeatureStructure> FSIterator<T> getAllIndexedFS(...)
> and in the JCas index version:
>                                FSIterator<FeatureStructure> getAllIndexedFS(...),
> and in the JCas itself:
>   <T extends TOP>              FSIterator<T> getAllIndexedFS(Class<T>
clazz)

IMHO the signatures should be as follows:

FSIndexRepository:  FSIterator<FeatureStructure> getAllIndexedFS(...);
JFSIndexRepository: FSIterator<TOP> getAllIndexedFS(...)
JCas:               <T extends TOP> FSIterator<T> getAllIndexedFS(Class<T>
clazz);

The first two signatures do not allow type inference. When I suggested to use
a signature that allowed type inference, Marshall pointed out it could lead to
undetected type errors. I find this is a valid objection and would therefore
prefer the versions that do not allow type inference.

The third signature allows for an inferred return type, but it is bound to the
clazz argument which allows a sound type checking. A method with this signature
could probably also be defined in the JFSIndexRepository.

> 3) At first I thought that it was quite possible that the getAllIndexedFS call
> will produce an iterator which returns a mixture of subtypes of the specified
> class, or instances of FeatureStructure, which are not a subtype of the
> specified class, and would produce a ClassCastException when assigned to a
> variable of the iterator type.  I thought that FeatureStructure will be produced
> if there is no JCas cover class, but I think this is incorrect.
> There are generators for each type that generate the corresponding Java class. 
> As part of the JCas creation, the method instantiateJCas_Types is called and
> will replace the generators with JCas versions.  If no JCas cover class can be
> found for a particular type, the generator is still replaced with one for the
> first JCas cover type found in the supertype chain.  Since TOP is a built-in
> defined JCas type, even if no other types have JCas cover types, TOP will.

I have never seen the JCas produce a mix. TOP should be a safe assumption here.
The only case I think where I have seen a JCas return FeatureStructureImpl was
when it was not properly initialized - may happen after some forms of deserialization.
Calling jcas.getCas().getJCas() typically fixes this.

> 4) Current user code often has some JCas cover class as the top-most class of
> some set of Java cover objects being returned.  These are currently failing for
> some forms of getAllIndexedFS calls - the ones returning
> FSIterator<FeatureStructure>  These should be changed (back) to returning either
> <T Extends TOP> or just TOP.

I'd vote for just TOP.

> 5) I'd like to use the <T extends TOP> form, but would like other Java experts
> to weigh in to see if that would cause issues...  To be precise, I'd like the
> signature to look like:
>   <T extends TOP>  FSIterator<T> getAllIndexedFS(...) for those calls that
> don't specify the Java class in the argument.
> The pros for this is that the coder can specify a subtype of TOP if they know
> that will be returned.  The con for this is that the coder could make a mistake,
> and a runtime class-cast exception could happen, if the item returned was not a
> subtype.

I'd still vote for just TOP in cases where no Java class is in the argument - 
and for introducing signatures that have a Java class in the argument ;)

Btw. uimaFIT uses TOP in cases where no Java class is specified and <T extends TOP>
if one is specified (and I hope it does it consistently).

When using CAS instead of JCas, uimaFIT typically offers a "plain" method, e.g.
CasUtil.selectAll() which returns Collection<AnnotationFS> and a "FS" method e.g.
CasUtil.selectAllFS() which returns Collection<FeatureStructure>. The assumption
here is that annotations are more common.

Cheers,

-- Richard
Mime
View raw message