tika-dev mailing list archives

From "Daniel Bonniot de Ruisselet (JIRA)" <j...@apache.org>
Subject [jira] [Created] (TIKA-1017) DefaultHtmlMapper misses some safe elements
Date Tue, 06 Nov 2012 10:54:12 GMT
Daniel Bonniot de Ruisselet created TIKA-1017:
-------------------------------------------------

             Summary: DefaultHtmlMapper misses some safe elements
                 Key: TIKA-1017
                 URL: https://issues.apache.org/jira/browse/TIKA-1017
             Project: Tika
          Issue Type: Bug
            Reporter: Daniel Bonniot de Ruisselet


The code of DefaultHtmlMapper says that the list of "safe" elements is based on http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd

Elements like <sub> and <i>, both of which are defined in that DTD, are not included in the safe list. Is this intentional
(in which case a comment with the rationale would be useful), or should they be added?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
