tika-dev mailing list archives

From "Tim Allison (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-2191) Apply current .docx unit tests to experimental SAX parser and fix or document as necessary
Date Wed, 07 Dec 2016 20:34:59 GMT

    [ https://issues.apache.org/jira/browse/TIKA-2191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15729844#comment-15729844
] 

Tim Allison commented on TIKA-2191:
-----------------------------------

I added paragraph numbering, styles and bookmarks.  I think I'm going to punt on handling
footnotes and comments closer to where they belong in the document.  I'll document that as
one of the major differences and call it a day...unless there is an urgent need for this.

Once I apply the patches to 2.x, I'll resolve this issue and run the regression tests.

> Apply current .docx unit tests to experimental SAX parser and fix or document as necessary
> ------------------------------------------------------------------------------------------
>
>                 Key: TIKA-2191
>                 URL: https://issues.apache.org/jira/browse/TIKA-2191
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Tim Allison
>            Priority: Minor
>
> There are many areas for cleanup to ensure that the new SAX .docx parser yields similar
> results to the legacy DOM .docx parser.  Let's use this issue to track work on improvements.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
