tika-dev mailing list archives

From "Ken Krugler (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-1017) DefaultHtmlMapper misses some safe elements
Date Tue, 06 Nov 2012 14:56:12 GMT

    [ https://issues.apache.org/jira/browse/TIKA-1017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13491492#comment-13491492 ]

Ken Krugler commented on TIKA-1017:
-----------------------------------

Hi Daniel - this sounds like a question for the mailing list. If, after discussion, it turns
out to be a bug, you can then file a Jira issue. The mailing list is also the best way
to get input from the author (I think that was Jukka).
                
> DefaultHtmlMapper misses some safe elements
> -------------------------------------------
>
>                 Key: TIKA-1017
>                 URL: https://issues.apache.org/jira/browse/TIKA-1017
>             Project: Tika
>          Issue Type: Bug
>            Reporter: Daniel Bonniot de Ruisselet
>
> The code of DefaultHtmlMapper says that the list of "safe" elements is based on http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd
> Elements like <sub> and <i> are not included in the safe list. Is this intentional
> (a comment with the rationale would be useful) or should they be added?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
