uima-dev mailing list archives

From Peter Klügl (JIRA) <...@uima.apache.org>
Subject [jira] [Created] (UIMA-2524) TextMarker html conversion to plain text is not working correctly
Date Thu, 13 Dec 2012 17:58:11 GMT
Peter Klügl created UIMA-2524:
---------------------------------

             Summary: TextMarker html conversion to plain text is not working correctly
                 Key: UIMA-2524
                 URL: https://issues.apache.org/jira/browse/UIMA-2524
             Project: UIMA
          Issue Type: Bug
          Components: TextMarker
            Reporter: Peter Klügl
            Assignee: Peter Klügl


The HTMLAnnotator shipped with TextMarker can strip the HTML tags from a document and create an additional
view containing the plain text. During this step, the tag information is converted to annotations
whose offsets are adjusted to account for the removed tags. This functionality is not working
correctly: the tags in the body of the HTML document are not removed.
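
To illustrate the intended behavior, here is a minimal sketch of tag stripping with offset adjustment. It is not the HTMLAnnotator's actual code and uses no UIMA API; the names HtmlStripSketch, TagSpan, Result and strip are hypothetical. It assumes well-nested tags and ignores comments, entities and void elements such as <br>.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Illustrative sketch only: strips tags and records them as spans over the plain text. */
class HtmlStripSketch {

    /** Hypothetical stand-in for an annotation: tag name plus offsets into the plain text. */
    static final class TagSpan {
        final String name;
        final int begin;
        final int end;

        TagSpan(String name, int begin, int end) {
            this.name = name;
            this.begin = begin;
            this.end = end;
        }
    }

    /** Plain text plus the spans derived from the removed tags. */
    static final class Result {
        final String text;
        final List<TagSpan> spans;

        Result(String text, List<TagSpan> spans) {
            this.text = text;
            this.spans = spans;
        }
    }

    static Result strip(String html) {
        StringBuilder plain = new StringBuilder();
        List<TagSpan> spans = new ArrayList<>();
        Deque<String> openNames = new ArrayDeque<>();
        Deque<Integer> openBegins = new ArrayDeque<>();

        int i = 0;
        while (i < html.length()) {
            char c = html.charAt(i);
            int close = (c == '<') ? html.indexOf('>', i) : -1;
            if (close < 0) {
                // Ordinary character (or an unterminated '<'): copy it to the plain text.
                plain.append(c);
                i++;
                continue;
            }
            String tag = html.substring(i + 1, close).trim();
            if (tag.startsWith("/")) {
                // Closing tag: the span ends at the current plain-text length.
                String name = tag.substring(1).trim();
                if (!openNames.isEmpty() && openNames.peek().equals(name)) {
                    openNames.pop();
                    spans.add(new TagSpan(name, openBegins.pop(), plain.length()));
                }
            } else if (!tag.endsWith("/")) {
                // Opening tag: remember where its content starts in the plain text.
                openNames.push(tag.split("\\s+")[0]);
                openBegins.push(plain.length());
            }
            // Skip the tag itself, so later offsets shift left by its length.
            i = close + 1;
        }
        return new Result(plain.toString(), spans);
    }
}
```

For the input "<html><body><p>Hi <b>you</b></p></body></html>", this sketch yields the plain text "Hi you", with the b span covering offsets 3 to 6 and the p, body and html spans covering 0 to 6. The bug described above corresponds to the body tags remaining in the plain text instead of being removed like the others.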

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
