uima-dev mailing list archives

From "Adam Lally" <ala...@alum.rpi.edu>
Subject Re: Default Result Specifications too complicated?
Date Fri, 15 Jun 2007 15:57:50 GMT
To update this thread:  We've determined that the particular use case
we know about that was relying on this feature of the
capabilityLanguageFlow could be addressed by changing the annotator
code if necessary, to check for existence of Tokens in the CAS before
creating additional Tokens.  Of course it isn't ideal that the
annotator code would have to change, but at least it is a possibility
since other options aren't ideal either.
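
As a rough illustration of that check, here is a minimal sketch.  It
does not use the real UIMA API: SketchCas, Span and GuardedTokenizer
are hypothetical stand-ins invented for this example, and the
whitespace tokenization is only a placeholder for whatever the real
annotator does.  The point is just the guard at the top of process().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the CAS: NOT the UIMA API, just text plus typed spans.
final class SketchCas {
    static final class Span {
        final String type;
        final int begin;
        final int end;

        Span(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }
    }

    final String text;
    final List<Span> spans = new ArrayList<>();

    SketchCas(String text) {
        this.text = text;
    }

    // Number of spans of the given type currently stored.
    int count(String type) {
        int n = 0;
        for (Span s : spans) {
            if (s.type.equals(type)) {
                n++;
            }
        }
        return n;
    }

    // True if at least one span of the given type is already present.
    boolean hasAny(String type) {
        return count(type) > 0;
    }
}

// Hypothetical annotator: creates Tokens only if no earlier step produced any.
final class GuardedTokenizer {
    static void process(SketchCas cas) {
        // The guard: leave the CAS alone if Tokens already exist.
        if (cas.hasAny("Token")) {
            return;
        }
        // Placeholder tokenization: one Token per whitespace-delimited run.
        int i = 0;
        int n = cas.text.length();
        while (i < n) {
            while (i < n && Character.isWhitespace(cas.text.charAt(i))) {
                i++;
            }
            int start = i;
            while (i < n && !Character.isWhitespace(cas.text.charAt(i))) {
                i++;
            }
            if (i > start) {
                cas.spans.add(new SketchCas.Span("Token", start, i));
            }
        }
    }

    public static void main(String[] args) {
        SketchCas cas = new SketchCas("two tokens");
        process(cas); // creates the Tokens
        process(cas); // no-op, Tokens already present
        System.out.println(cas.count("Token"));
    }
}
```

With a guard like this the annotator no longer depends on its Result
Specification to avoid duplicate Tokens, at the cost of one index
lookup per CAS.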

The questions we still need to reach agreement on are:

(1) Should we change Apache UIMA to allow the FlowController to set
the ResultSpecification of annotators it calls, so that we can have a
capabilityLanguageFlow that behaves exactly the same as it did in
1.x?

(2) If the answer to #1 is no, should we remove the
capabilityLanguageFlow from Apache UIMA 2.2, perhaps renaming it to
languageFlow or something like that?


We might (?) have reached some sort of reluctant consensus on (1) that
we aren't going to change this, with Michael and perhaps Thilo
agreeing to disagree, but I am not sure.

We seem far apart still on (2).  Ultimately I don't think I would
stand in the way of renaming capabilityLanguageFlow if that is the
consensus of the other committers.

-Adam




On 6/12/07, Adam Lally <alally@alum.rpi.edu> wrote:
> On 6/12/07, Thilo Goetz <twgoetz@gmx.de> wrote:
> > no, this is not an option.
>
> So you vote against the option... that is fine, it doesn't mean it
> isn't an option *to consider*.
>
> > We have users who use the capabilityLanguageFlow
> > in 1.4 in ways that will break in 2.x.  I don't want them to migrate to Apache
> > UIMA, happily use that flow and then have things break in subtle ways.  We
> > either fix it so it's backwards compatible, or we rename it so people don't
> > think it's the same.
> >
>
> We said in the 2.0 "what's new" - there have been changes in the
> ResultSpecification, we don't guarantee applications that use them
> will be backwards compatible.  We can add to that the same thing for
> capabilityLanguageFlow.  Sometimes, things change between 1.x and 2.x.
> Removing/renaming capabilityLanguageFlow completely just unilaterally
> breaks everybody just to avoid confusing a handful (or fewer) of users
> that might have been affected.
>
> > >
> > > While adding a SimpleStepWithResultSpec is a possibility for backwards
> > > compatibility, I'm really not that happy with that idea going forward,
> > > since it encourages people to build applications that rely on Result
> > > Specifications.  I think Result Specifications should only be a
> > > performance optimization.  Since only a handful of annotators in the
> > > world pay attention to their Result Spec at all, I think it's not a
> > > very good idea for applications to rely on them.  In your example, if
> > > the second annotator produces tokens will this be just a performance
> > > problem or will the application actually break?
> >
> > The application will break.
> >
>
> I recommend we make it explicit that applications cannot expect
> annotators to observe the Result Specification. If an application is
> going to do this then it might as well rely on the annotator to check
> the CAS for existing Tokens before creating new ones, as I suggested.
> This one obscure use case can be handled in an application-specific
> way, rather than adding complexity to the framework.
>
> Result Specs are complicated enough (the original point of this issue)
> even when they are derived deterministically from the descriptors.  If
> FlowControllers can set them arbitrarily, the poor users will have no
> hope of understanding why they aren't getting the results they expect.
>
> -Adam
>
