uima-dev mailing list archives

From "Adam Lally" <ala...@alum.rpi.edu>
Subject Re: Default Result Specifications too complicated?
Date Tue, 12 Jun 2007 16:34:49 GMT
On 6/12/07, Thilo Goetz <twgoetz@gmx.de> wrote:
> no, this is not an option.

So you vote against the option... that is fine, but it doesn't mean it
isn't an option *to consider*.

> We have users who use the capabilityLanguageFlow
> in 1.4 in ways that will break in 2.x.  I don't want them to migrate to Apache
> UIMA, happily use that flow and then have things break in subtle ways.  We
> either fix it so it's backwards compatible, or we rename it so people don't
> think it's the same.
>

We said in the 2.0 "what's new" that the ResultSpecification has
changed and that we don't guarantee backwards compatibility for
applications that use it.  We can add the same caveat for
capabilityLanguageFlow.  Sometimes things change between 1.x and 2.x.
Removing or renaming capabilityLanguageFlow outright breaks everybody
unilaterally, just to avoid confusing a handful (or fewer) of users
who might have been affected.

> >
> > While adding a SimpleStepWithResultSpec is a possibility for backwards
> > compatibility, I'm really not that happy with that idea going forward,
> > since it encourages people to build applications that rely on Result
> > Specifications.  I think Result Specifications should only be a
> > performance optimization.  Since only a handful of annotators in the
> > world pay attention to their Result Spec at all, I think it's not a
> > very good idea for applications to rely on them.  In your example, if
> > the second annotator produces tokens will this be just a performance
> > problem or will the application actually break?
>
> The application will break.
>

I recommend we make it explicit that applications cannot expect
annotators to observe the Result Specification. If an application is
going to depend on annotator behavior anyway, it might as well rely on
the annotator checking the CAS for existing Tokens before creating new
ones, as I suggested.  This one obscure use case can be handled in an
application-specific way, rather than by adding complexity to the
framework.
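
To make the "check for existing Tokens first" idea concrete, here is a
minimal sketch in plain Java.  It deliberately does not use the UIMA
API: Document, Annotation and tokenizeIfNeeded are hypothetical
stand-ins for the CAS, an annotation, and the annotator's process
step, and whitespace splitting stands in for real tokenization.  In a
real annotator the same guard would be made against the CAS's
annotation index for the Token type.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of the idea; none of these types are part of UIMA.
final class TokenGuardSketch {

    /** Hypothetical minimal annotation: a type name plus a character span. */
    static final class Annotation {
        final String type;
        final int begin;
        final int end;

        Annotation(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }
    }

    /** Hypothetical stand-in for a CAS: document text plus its annotations. */
    static final class Document {
        final String text;
        final List<Annotation> annotations = new ArrayList<>();

        Document(String text) {
            this.text = text;
        }

        boolean hasAnnotationsOfType(String type) {
            for (Annotation a : annotations) {
                if (a.type.equals(type)) {
                    return true;
                }
            }
            return false;
        }
    }

    /**
     * Adds whitespace-delimited Token annotations, unless an earlier
     * annotator already produced Tokens. Returns the number added.
     */
    static int tokenizeIfNeeded(Document doc) {
        if (doc.hasAnnotationsOfType("Token")) {
            return 0; // Tokens already present: do not create duplicates.
        }
        int added = 0;
        int i = 0;
        int n = doc.text.length();
        while (i < n) {
            // Skip whitespace, then consume one run of non-whitespace.
            while (i < n && Character.isWhitespace(doc.text.charAt(i))) {
                i++;
            }
            int start = i;
            while (i < n && !Character.isWhitespace(doc.text.charAt(i))) {
                i++;
            }
            if (i > start) {
                doc.annotations.add(new Annotation("Token", start, i));
                added++;
            }
        }
        return added;
    }

    public static void main(String[] args) {
        Document doc = new Document("Result Specs are hints");
        System.out.println(tokenizeIfNeeded(doc)); // prints 4
        System.out.println(tokenizeIfNeeded(doc)); // prints 0
    }
}
```

The guard lives entirely in the annotator, so the outcome is the same
whatever Result Specification the framework or a FlowController hands
it.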

Result Specs are complicated enough (the original point of this issue)
even when they are determined entirely by the descriptors.  If
FlowControllers can set them arbitrarily, the poor users will have no
hope of understanding why they aren't getting the results they expect.

-Adam
