tika-dev mailing list archives

From "Tim Allison (JIRA)" <j...@apache.org>
Subject [jira] [Created] (TIKA-2192) Extract embedded files from headers, footers, footnotes, etc from docx
Date Tue, 06 Dec 2016 01:55:58 GMT
Tim Allison created TIKA-2192:
---------------------------------

             Summary: Extract embedded files from headers, footers, footnotes, etc from docx
                 Key: TIKA-2192
                 URL: https://issues.apache.org/jira/browse/TIKA-2192
             Project: Tika
          Issue Type: Improvement
            Reporter: Tim Allison


While working on an alternate SAX parser for docx/docm, I found that we're not currently extracting
embedded documents from headers, footers, footnotes, endnotes or comments.  We should fix
this in our classic DOM parser.
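
To make the gap concrete, here is a minimal sketch of how one might check whether a given docx references embedded files from those parts at all. It is not Tika's parser code and does not use Tika or POI; it only reads the OOXML zip package directly. It assumes the part names Word conventionally writes (word/header1.xml, word/footnotes.xml, and so on, with relationships under word/_rels/) and that embedded files are referenced as Target="embeddings/...". The class name, method name, and regexes are hypothetical, chosen for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of Tika: lists embedded-file references that
// hang off headers, footers, footnotes, endnotes, or comments in a docx.
class DocxEmbeddedPartScan {

    // Assumption: Word's conventional part names. The packaging spec allows
    // other names, so a real parser should follow relationships instead.
    private static final Pattern REL_PART = Pattern.compile(
            "word/_rels/(header\\d*|footer\\d*|footnotes|endnotes|comments)\\.xml\\.rels");

    // Assumption: embedded files are referenced relative to word/ as
    // Target="embeddings/...". A regex is used only to keep the sketch short.
    private static final Pattern EMBEDDING_TARGET = Pattern.compile(
            "Target=\"(embeddings/[^\"]+)\"");

    static List<String> findEmbeddedTargets(InputStream docx) throws IOException {
        List<String> hits = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!REL_PART.matcher(entry.getName()).matches()) {
                    continue; // skip the main document and unrelated parts
                }
                // Read the current entry fully; read() returns -1 at entry end.
                ByteArrayOutputStream buf = new ByteArrayOutputStream();
                byte[] chunk = new byte[4096];
                int n;
                while ((n = zip.read(chunk)) != -1) {
                    buf.write(chunk, 0, n);
                }
                String xml = new String(buf.toByteArray(), StandardCharsets.UTF_8);
                Matcher m = EMBEDDING_TARGET.matcher(xml);
                while (m.find()) {
                    hits.add(entry.getName() + " -> word/" + m.group(1));
                }
            }
        }
        return hits;
    }
}
```

Calling DocxEmbeddedPartScan.findEmbeddedTargets(...) on a file stream returns one line per reference found. A non-empty result for a document whose parsed output contains no corresponding embedded files would be consistent with the behavior described above.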



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
