tika-dev mailing list archives

From "Serban Alexe (JIRA)" <j...@apache.org>
Subject [jira] [Created] (TIKA-2555) Text with [underline] + [another format] in word document generates overlapping html tags.
Date Thu, 25 Jan 2018 17:08:00 GMT
Serban Alexe created TIKA-2555:
----------------------------------

             Summary: Text with [underline] + [another format] in word document generates overlapping html tags.
                 Key: TIKA-2555
                 URL: https://issues.apache.org/jira/browse/TIKA-2555
             Project: Tika
          Issue Type: Bug
    Affects Versions: 1.17
            Reporter: Serban Alexe
         Attachments: Clipboard02.jpg

I have a sample _.docx_ document which contains a single line of text.

Formatting that text as:
 * +underlined+
 * AND at least one of the following two:
 ** _italic_
 ** *bold*

will cause the generated _.xhtml_ file to contain overlapping tags.

 

_+Example+_:

*+The quick brown fox jumps over the lazy dog.+*

will result in

<b><u>The quick brown fox jumps over the lazy dog.</b></u> 

which causes some browsers (Firefox, Chrome) to report an error and not display the content of
the file...
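
The browser error can be reproduced without a browser, since any strict XML parser rejects overlapping tags. The following is a minimal sketch, not part of Tika: it assumes Python's standard xml.etree.ElementTree as a stand-in for the browser's XHTML parser, and the helper name is_well_formed is hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as well-formed XML."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        # Raised for mismatched or overlapping tags, among other errors.
        return False


# Output reported in this issue: </b> is closed while <u> is still open.
overlapping = "<b><u>The quick brown fox jumps over the lazy dog.</b></u>"

# Properly nested form: tags are closed in reverse order of opening.
nested = "<b><u>The quick brown fox jumps over the lazy dog.</u></b>"

print(is_well_formed(overlapping))
print(is_well_formed(nested))
```

The first fragment fails to parse because of the mismatched closing tag, while the second parses cleanly, which suggests the expected output is the properly nested form.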

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
