uima-dev mailing list archives

From Marshall Schor <...@schor.com>
Subject Re: changing edge case impl details in casCopiers
Date Fri, 01 Apr 2016 13:57:55 GMT
Hi Richard,

Thanks for this use-case.  I think there may be 2 subcases.

1) The views, A and B, are in the same CAS, and
2) The views, A and B, are in different CASes

In case 1), with this new proposal the annotations copied from view A to B would
have their "sofa" reference continue to point to the text in view A.  This means:

a) The references into the text are still "valid", but of course point to the
text in view A.
b) For the update step that makes them point into the de-xml'ed version of the
text, not only do the begin/end references need to be updated, but the sofa
reference needs to be changed as well.  We could add an API to set that to the
current view's sofa.

In case 2), the annotations in B would no longer have a valid sofa reference at
all (it would be set to null).
This would clearly be a problem; but once again, we could add an API to update
that to the current view's.
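
To make the update step concrete, here is a rough sketch in plain Java.  It does
not use the UIMA API: View, Ann, offsetMap and retarget are hypothetical
stand-ins, and the tag stripping is deliberately naive (every '<' or '>' and
anything between them is dropped).  It only illustrates that both the begin/end
offsets and the sofa reference have to change before a copied annotation is
valid against the text in view B.

```java
class SofaRetargetSketch {
    /** Hypothetical stand-in for a view: a name plus its text (the sofa data). */
    static final class View {
        final String name;
        final String text;
        View(String name, String text) { this.name = name; this.text = text; }
    }

    /** Hypothetical stand-in for an annotation: offsets plus the view they index into. */
    static final class Ann {
        int begin;
        int end;
        View sofa; // null models the "different CASes" case under the proposal
        Ann(int begin, int end, View sofa) {
            this.begin = begin;
            this.end = end;
            this.sofa = sofa;
        }
        String coveredText() { return sofa.text.substring(begin, end); }
    }

    /**
     * Strips tags from src into 'stripped' and returns a table where
     * map[i] is the offset in the stripped text that corresponds to
     * offset i in src. Naive: '<' and '>' are always treated as tag delimiters.
     */
    static int[] offsetMap(String src, StringBuilder stripped) {
        int[] map = new int[src.length() + 1];
        boolean inTag = false;
        for (int i = 0; i < src.length(); i++) {
            map[i] = stripped.length();
            char c = src.charAt(i);
            if (c == '<') {
                inTag = true;
            } else if (c == '>') {
                inTag = false;
            } else if (!inTag) {
                stripped.append(c);
            }
        }
        map[src.length()] = stripped.length();
        return map;
    }

    /** Updates begin/end through the offset table and points the sofa at the target view. */
    static void retarget(Ann ann, int[] map, View target) {
        ann.begin = map[ann.begin];
        ann.end = map[ann.end];
        ann.sofa = target;
    }

    public static void main(String[] args) {
        View a = new View("A", "<p>Hello</p> <b>world</b>");
        StringBuilder stripped = new StringBuilder();
        int[] map = offsetMap(a.text, stripped);
        View b = new View("B", stripped.toString());

        // An annotation for the <p> element, copied from A; its sofa is null
        // here, as it would be after a copy into a different CAS.
        Ann copied = new Ann(0, 12, null);
        retarget(copied, map, b);
        System.out.println(copied.coveredText());
    }
}
```

In this toy example the annotation covering "<p>Hello</p>" (0..12 in A) ends up
as 0..5 in B, which covers "Hello".  A real update API would presumably do the
sofa part of retarget() and leave the offset mapping to the application.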


So, it looks like this proposed design change would break the use-case you
described.

The current design would seem to support this use case, but only if the two
views are in different CASes.
If they were in the same CAS, I think the current implementation (not tested,
just reading the code) would leave the copied Annotations with their sofa
references pointing to the sofa in CAS A.

Does this match what you're currently seeing?


On 3/31/2016 4:36 PM, Richard Eckart de Castilho wrote:
> On 31.03.2016, at 21:22, Marshall Schor <msa@schor.com> wrote:
>> I'm thinking of changing how cas copier works with respect to managing Sofas and
>> sofa ref updating.  I've written something up here:
>> https://cwiki.apache.org/confluence/display/UIMA/CasCopier+and+Views
>> Comments / feedback / what did I overlook?  appreciated :-) -Marshall
> Consider the following case:
> - there are two views, A and B
> - the text in B has been derived from A through some transformation, e.g. the removal
>   of XML tags
> - A contains UIMA annotations that represent the XML tags and they point into the text
>   in A
> - as part of a second transformation process, all annotations in A are to be copied into B
> - after the copy has been performed, the offsets of the copied annotations are updated
> Would such a scenario still be supported after the changes you suggest?
> Best,
> -- Richard