uima-dev mailing list archives

From Thilo Goetz <twgo...@gmx.de>
Subject Re: small memory footprint tradeoff configuration
Date Wed, 11 Mar 2009 12:26:14 GMT
Marshall Schor wrote:
[...]
> I agree that backward compatibility is important and is an issue.  To
> help the transition to this new scheme, I think an overall global switch
> is needed (similar to the switches we have for JCas "interning") that
> would by default make things work the way they do now.  A user
> interested in small-footprint operation (and in trading off some
> additional processing cycles to achieve it) would enable this switch.
> 
> To help it "work" - we would allow operations which "set" a non-stored
> feature to continue - these sets would just become no-ops.  Then if
> the annotator wasn't paying attention to ResultSpecification, and tried
> to set features that were not used, it would still work.
> 
> On the other end, if an annotator actually made use of a particular
> feature, but didn't specify it in its "input capability specification",
> that would fail with this scheme.  The failure would be some kind of
> Java exception, which would probably be noticed.  To recover, a user of
> such a component would modify the input capability specification to
> indicate that that feature was needed. 
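
For reference, the behavior proposed above could be sketched roughly as
follows. This is only an illustration under assumptions: FeatureStore,
its smallFootprint flag and the storedFeatures set are hypothetical
names, not part of the UIMA CAS API, and the exception type is a guess
at "some kind of Java exception".

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of the proposed switch; not the real UIMA CAS API. */
public class SmallFootprintSketch {

    static final class FeatureStore {
        // The proposed global switch; false keeps today's behavior.
        private final boolean smallFootprint;
        // Features that some component declared as needed (assumed input).
        private final Set<String> storedFeatures;
        private final Map<String, Object> values = new HashMap<>();

        FeatureStore(boolean smallFootprint, Set<String> storedFeatures) {
            this.smallFootprint = smallFootprint;
            this.storedFeatures = storedFeatures;
        }

        /** Setting a non-stored feature silently becomes a no-op. */
        void set(String feature, Object value) {
            if (smallFootprint && !storedFeatures.contains(feature)) {
                return;
            }
            values.put(feature, value);
        }

        /** Reading a non-stored feature fails loudly so it gets noticed. */
        Object get(String feature) {
            if (smallFootprint && !storedFeatures.contains(feature)) {
                throw new IllegalStateException(
                        "Feature not stored, declare it as an input: " + feature);
            }
            return values.get(feature);
        }
    }

    public static void main(String[] args) {
        FeatureStore fs = new FeatureStore(true, Set.of("begin"));
        fs.set("begin", 0);      // declared, so it is stored
        fs.set("lemma", "be");   // undeclared, so it is dropped
        System.out.println(fs.get("begin"));
        try {
            fs.get("lemma");
        } catch (IllegalStateException e) {
            System.out.println("failed as proposed: " + e.getMessage());
        }
    }
}
```

With the switch off (smallFootprint == false), both calls store and
return values as they do today, which is the backward-compatible
default described in the quoted text.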

If a feature is defined in the type system, it should be there
for the annotator writer to use.  Who are we to know how people
will use those features?

> 
> As I write this, I notice that the input capability specification for a
> primitive annotator doesn't quite fit the meaning here - because I think
> it means that this annotator needs that feature upon input - and this
> edge case - where the annotator itself produces this feature, and then
> also uses it - is not part of that definition. We could either expand
> the meaning here to include this edge case, or (possibly a better
> option) introduce, explicitly, another piece of metadata indicating that
> a particular type/field was both created and used by this one primitive
> annotator.  A third option could be to store these "unused" features if
> set (in some out-of-line temporary storage) for the duration of the
> running of a particular annotator, just in case these were "used" by the
> same annotator, and then discard that extra storage after the annotator
> exits.  This would be a big (but temporary) storage hit, though, so I
> don't think I would want to do this.

I vote we don't make things even more complicated than they
already are, and educate those people who need a performance
boost.

--Thilo

