tika-dev mailing list archives

From "Akram, Hassan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-1255) WordExtractor - bold hyperlink not closed properly
Date Mon, 16 Mar 2015 02:33:39 GMT

    [ https://issues.apache.org/jira/browse/TIKA-1255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14362668#comment-14362668 ]

Akram, Hassan commented on TIKA-1255:
-------------------------------------

Hi,

I am on annual leave and will return on 5th January 2015.

If you want to discuss anything related to the Colossus team or 14R1 SP2, please reach out to Craig Pinkerton.
For anything else, please contact Leigh Dastey.

I will pick up emails on my return.

Regards,
Hassan



> WordExtractor - bold hyperlink not closed properly
> --------------------------------------------------
>
>                 Key: TIKA-1255
>                 URL: https://issues.apache.org/jira/browse/TIKA-1255
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.2, 1.3, 1.4, 1.5
>         Environment: Any
>            Reporter: Alan Hunter
>            Priority: Minor
>         Attachments: WordExtractor.java, WordParserTest.java, example.doc, testWORD_bold_hyperlink.doc, testWORD_italic_hyperlink.doc, testWORD_strikethrough_hyperlink.doc
>
>
> If a Word document contains a bold hyperlink, the resulting xhtml is:
> <a href="http://www.testdomain.com/support/workcentre-7232-7242/file-download/enus.html?operatingSystem=macosx108&amp;amp;fileLanguage=en&amp;amp;contentId=126220&amp;amp;from=downloads&amp;amp;viewArchived=false"><b>Test link</a></b>
> The closing bold and anchor tags are transposed, which isn't valid XHTML.
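
For anyone who wants to reproduce the nesting problem outside of Tika, here is a minimal sketch that feeds the reported fragment to the JDK's SAX parser. It only checks well-formedness and does not touch WordExtractor itself. The class name NestingCheck, the method isWellFormed, and the shortened href are illustrative assumptions, not part of the attached patch.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration only; not part of Tika.
class NestingCheck {

    /** Returns true if the fragment parses as well-formed XML. */
    public static boolean isWellFormed(String fragment) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            // DefaultHandler rethrows fatal errors as SAXParseException.
            parser.parse(new InputSource(new StringReader(fragment)), new DefaultHandler());
            return true;
        } catch (SAXParseException e) {
            // The parser rejected the markup, e.g. mismatched end tags.
            return false;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Setup or I/O failure, unrelated to the fragment's structure.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // href shortened for readability (assumption); nesting matches the report.
        String reported = "<a href=\"http://www.testdomain.com/\"><b>Test link</a></b>";
        String corrected = "<a href=\"http://www.testdomain.com/\"><b>Test link</b></a>";

        System.out.println("reported well-formed:  " + isWellFormed(reported));
        System.out.println("corrected well-formed: " + isWellFormed(corrected));
    }
}
```

The parser rejects the first fragment because the b element is closed after its parent a element; closing the tags in the reverse of the order they were opened, as in the second fragment, is what a fix would need to produce.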



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
