uima-dev mailing list archives

From "Hannes Korte (JIRA)" <uima-...@incubator.apache.org>
Subject [jira] Created: (UIMA-1764) Wrong behavior of the annotation index subiterator
Date Tue, 13 Apr 2010 12:52:50 GMT
Wrong behavior of the annotation index subiterator
--------------------------------------------------

                 Key: UIMA-1764
                 URL: https://issues.apache.org/jira/browse/UIMA-1764
             Project: UIMA
          Issue Type: Bug
    Affects Versions: 2.3, 2.2.2
            Reporter: Hannes Korte
         Attachments: files.zip

I noticed a strange behavior of the annotation index subiterator in
uimaj 2.2.2 and 2.3.0.

Consider the sentence: 'Testing the UIMA-Framework'
with tokens: 'Testing' 'the' 'UIMA-Framework'
and the named entity: 'UIMA'

The type priorities list NamedEntity above Token.

If I call the Token subiterator for the NamedEntity 'UIMA' with
strict=false, I get an empty result. According to the docs, with
strict=false a Token is contained in the NamedEntity if:

  annot.getBegin() <= b.getBegin() <= annot.getEnd()

for NamedEntity annot and Token b. This holds for the NamedEntity 'UIMA'
and the Token 'UIMA-Framework', since both begin at the same offset, but
the subiterator is empty.

If I change the NamedEntity to ' UIMA' (including the preceding space),
then it works correctly, and the Token 'UIMA-Framework' is contained in
the subiterator.
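
To make the expectation concrete, here is a minimal sketch in plain Java
that evaluates the documented strict=false condition for both cases. It
does not use the UIMA API and is not taken from the attached files: the
Span and NonStrictContainment classes are hypothetical, and the offsets
are assumptions computed by counting characters in the example sentence.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a UIMA annotation: covered text plus character offsets.
final class Span {
    final String text;
    final int begin;
    final int end;

    Span(String text, int begin, int end) {
        this.text = text;
        this.begin = begin;
        this.end = end;
    }
}

final class NonStrictContainment {
    // The strict=false condition quoted from the docs:
    // annot.getBegin() <= b.getBegin() <= annot.getEnd()
    static boolean contains(Span annot, Span b) {
        return annot.begin <= b.begin && b.begin <= annot.end;
    }

    // Collects the texts of all tokens that satisfy the condition for annot.
    static List<String> expectedTokens(Span annot, List<Span> tokens) {
        List<String> result = new ArrayList<>();
        for (Span token : tokens) {
            if (contains(annot, token)) {
                result.add(token.text);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed offsets in "Testing the UIMA-Framework".
        List<Span> tokens = new ArrayList<>();
        tokens.add(new Span("Testing", 0, 7));
        tokens.add(new Span("the", 8, 11));
        tokens.add(new Span("UIMA-Framework", 12, 26));

        Span entity = new Span("UIMA", 12, 16);           // case that returns nothing
        Span entityWithSpace = new Span(" UIMA", 11, 16); // case that works

        System.out.println(expectedTokens(entity, tokens));
        System.out.println(expectedTokens(entityWithSpace, tokens));
    }
}
```

By the condition alone, both NamedEntity variants should yield the Token
'UIMA-Framework', so the difference I observe must come from something
other than this check.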

I attached a simple Java class with all needed files to demonstrate the
problem. Any ideas?

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
