tika-dev mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-2192) Extract embedded files from headers, footers, footnotes, etc from docx/m
Date Tue, 06 Dec 2016 14:47:58 GMT

    [ https://issues.apache.org/jira/browse/TIKA-2192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15725698#comment-15725698 ]

Hudson commented on TIKA-2192:
------------------------------

SUCCESS: Integrated in Jenkins build Tika-trunk #1150 (See [https://builds.apache.org/job/Tika-trunk/1150/])
TIKA-2192 - add extraction of embedded objects in DOM docx parser from (tallison: rev 615bf75fc11e8fc299be550b8cd4bb24f45a264a)
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/microsoft/ooxml/XWPFWordExtractorDecorator.java
* (edit) tika-parsers/src/test/java/org/apache/tika/parser/microsoft/ooxml/OOXMLParserTest.java
update changes for TIKA-2191 and TIKA-2192 (tallison: rev 5425d02a1ed97ce5f884a076f55ad8197cc6ac7b)
* (edit) CHANGES.txt


> Extract embedded files from headers, footers, footnotes, etc from docx/m
> ------------------------------------------------------------------------
>
>                 Key: TIKA-2192
>                 URL: https://issues.apache.org/jira/browse/TIKA-2192
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Tim Allison
>
> While working on an alternate SAX parser for docx/docm, I found that we're not currently
> extracting embedded documents from headers, footers, footnotes, endnotes or comments. We
> should fix this in our classic DOM parser.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
