uima-dev mailing list archives

From "Thilo Goetz (JIRA)" <uima-...@incubator.apache.org>
Subject [jira] Closed: (UIMA-1764) Wrong behavior of the annotation index subiterator
Date Fri, 16 Apr 2010 13:46:27 GMT

     [ https://issues.apache.org/jira/browse/UIMA-1764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thilo Goetz closed UIMA-1764.
-----------------------------

    Resolution: Not A Problem

This does in fact work as designed.  The crucial piece of information missing from your
quote from the docs is that we must also have annot < b in the index order.  In your example,
b, the token "UIMA-Framework", precedes the named entity "UIMA" because it starts at the
same position and is longer.  This is true independently of the type priorities.
The type priorities only come into play when two annotations start and end at the same position.

When you change the named entity to " UIMA", it starts one character earlier and therefore
precedes "UIMA-Framework"; annot < b is true, and we see the expected result.
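
To make the ordering argument concrete, here is a minimal sketch in plain Java. It does not
use the UIMA API: the Span class, the indexOrder comparator and the tokensUnder helper are
hypothetical names made up for illustration, and the model only assumes what is described
above (begin ascending, longer annotation first on equal begin, type priority as the last
tie-breaker, plus the non-strict condition quoted from the docs).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class SubiteratorSketch {

    // Hypothetical stand-in for an annotation: a type name plus begin/end offsets.
    static final class Span {
        final String type;
        final int begin;
        final int end;
        Span(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }
    }

    static final String TEXT = "Testing the UIMA-Framework";

    // Simplified model of the annotation index order described above:
    // begin ascending, then longer annotation first, then type priority.
    static Comparator<Span> indexOrder(List<String> typePriorities) {
        return (a, b) -> {
            if (a.begin != b.begin) return Integer.compare(a.begin, b.begin);
            if (a.end != b.end) return Integer.compare(b.end, a.end); // longer first
            return Integer.compare(typePriorities.indexOf(a.type),
                                   typePriorities.indexOf(b.type));
        };
    }

    // Covered text of the tokens a non-strict subiterator would return in this model:
    // annot < b in index order AND annot.begin <= b.begin <= annot.end.
    static List<String> tokensUnder(int neBegin, int neEnd) {
        Comparator<Span> order = indexOrder(Arrays.asList("NamedEntity", "Token"));
        Span annot = new Span("NamedEntity", neBegin, neEnd);
        List<Span> tokens = Arrays.asList(
                new Span("Token", 0, 7),     // Testing
                new Span("Token", 8, 11),    // the
                new Span("Token", 12, 26));  // UIMA-Framework
        List<String> result = new ArrayList<>();
        for (Span b : tokens) {
            boolean annotPrecedesB = order.compare(annot, b) < 0;
            boolean beginInside = annot.begin <= b.begin && b.begin <= annot.end;
            if (annotPrecedesB && beginInside) {
                result.add(TEXT.substring(b.begin, b.end));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokensUnder(12, 16)); // named entity "UIMA"
        System.out.println(tokensUnder(11, 16)); // named entity " UIMA"
    }
}
```

In this model the first call yields no tokens, because the token at 12-26 sorts before the
named entity at 12-16, and the second call yields "UIMA-Framework". Swapping the two entries
in the priority list changes neither result, since the spans never share both begin and end.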

Please let us know if you still have questions, or if you have any suggestions on how to improve
the documentation.


> Wrong behavior of the annotation index subiterator
> --------------------------------------------------
>
>                 Key: UIMA-1764
>                 URL: https://issues.apache.org/jira/browse/UIMA-1764
>             Project: UIMA
>          Issue Type: Bug
>          Components: Core Java Framework
>    Affects Versions: 2.2.2, 2.3
>            Reporter: Hannes Korte
>            Assignee: Thilo Goetz
>         Attachments: files.zip
>
>
> I noticed a strange behavior of the annotation index subiterator in
> uimaj 2.2.2 and 2.3.0.
> Consider the sentence: 'Testing the UIMA-Framework'
> with tokens: 'Testing' 'the' 'UIMA-Framework'
> and the named entity: 'UIMA'
> The type priorities list NamedEntity on top of the Token type.
> If I call the Token subiterator for the NamedEntity 'UIMA' with
> strict=false, I get an empty result. According to the docs, the
> definition of Tokens contained in the NamedEntity is, in the
> strict=false setting:
>   annot.getBegin() <= b.getBegin() <= annot.getEnd()
> for NamedEntity annot and Token b. This is true for 'UIMA' and
> 'UIMA-Framework', but the subiterator is empty.
> If I change the NamedEntity to ' UIMA' (including the preceding space),
> then it works correctly, and the Token 'UIMA-Framework' is contained in
> the subiterator.
> I appended a simple java class with all needed files to demonstrate the
> problem. Any ideas?


        
