tika-dev mailing list archives

From "Tim Allison (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-2265) Problem with footnotes/endnotes in Tika.parseToString with MS Word (.docx) files
Date Mon, 13 Feb 2017 13:39:42 GMT

    [ https://issues.apache.org/jira/browse/TIKA-2265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15863681#comment-15863681 ]

Tim Allison commented on TIKA-2265:
-----------------------------------

Thank you for opening this.  I'll take a look.

> Problem with footnotes/endnotes in Tika.parseToString with MS Word (.docx) files
> --------------------------------------------------------------------------------
>
>                 Key: TIKA-2265
>                 URL: https://issues.apache.org/jira/browse/TIKA-2265
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.14
>         Environment: N/A
>            Reporter: Mike Rodent
>            Assignee: Tim Allison
>            Priority: Minor
>              Labels: newbie
>
> A footnote numbered "1" in the source document appears to be output by Tika.parseToString()
> as "2", both in the footnote reference and in the corresponding footnote body text. Likewise,
> real footnote "2" becomes "3", "3" becomes "4", and so on. I have not yet looked at the
> source code, but I can't imagine this would be difficult to correct.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
