tika-dev mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-2191) Apply current .docx unit tests to experimental SAX parser and fix or document as necessary
Date Mon, 12 Dec 2016 13:15:59 GMT

    [ https://issues.apache.org/jira/browse/TIKA-2191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15741859#comment-15741859 ]

Hudson commented on TIKA-2191:
------------------------------

UNSTABLE: Integrated in Jenkins build Tika-trunk #1153 (See [https://builds.apache.org/job/Tika-trunk/1153/])
TIKA-2191: fixes after regression testing on TIKA_1302 corpus: 1) add (tallison: rev faf6c2b24814ded27f05388f8a417c2df5bf5c7a)
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/microsoft/ooxml/SXWPFWordExtractorDecorator.java
* (edit) tika-parsers/src/test/java/org/apache/tika/parser/microsoft/ooxml/SXWPFExtractorTest.java
* (add) tika-parsers/src/test/resources/test-documents/testWORD_template.dotx
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/microsoft/ooxml/xwpf/XWPFDocumentXMLBodyHandler.java


> Apply current .docx unit tests to experimental SAX parser and fix or document as necessary
> ------------------------------------------------------------------------------------------
>
>                 Key: TIKA-2191
>                 URL: https://issues.apache.org/jira/browse/TIKA-2191
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Tim Allison
>            Priority: Minor
>
> There are many areas for clean up to ensure that the new SAX .docx parser yields similar
> results to the legacy DOM .docx parser.  Let's use this issue to track work on improvements.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
