tika-dev mailing list archives

From "Nick Burch (JIRA)" <j...@apache.org>
Subject [jira] [Created] (TIKA-652) Custom metadata from more formats
Date Tue, 03 May 2011 05:17:03 GMT
Custom metadata from more formats
---------------------------------

                 Key: TIKA-652
                 URL: https://issues.apache.org/jira/browse/TIKA-652
             Project: Tika
          Issue Type: Improvement
          Components: parser
    Affects Versions: 0.9
            Reporter: Nick Burch
            Assignee: Nick Burch


Currently, Tika only extracts custom metadata from Open Document files. Each custom property
is returned with a custom: prefix (see OpenOfficeParserTest#testOO2Metadata for an example).
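
As a rough illustration of the prefix convention described above, here is a minimal sketch
of how a caller might pick the custom properties out of the extracted metadata. It uses a
plain java.util.Map as a stand-in for the parsed metadata so that it runs without Tika on
the classpath. The class name, method name, and sample keys are hypothetical and are not
taken from Tika or from the test mentioned above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CustomMetadataSketch {
    // Prefix the issue describes for custom document properties.
    public static final String CUSTOM_PREFIX = "custom:";

    // Returns only the custom entries, with the prefix stripped from each key.
    // The Map parameter is an assumption standing in for real parser output.
    public static Map<String, String> customEntries(Map<String, String> metadata) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : metadata.entrySet()) {
            String key = entry.getKey();
            if (key.startsWith(CUSTOM_PREFIX)) {
                result.put(key.substring(CUSTOM_PREFIX.length()), entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical sample data: one standard key and one custom key.
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("title", "Example");
        metadata.put("custom:Reviewer", "someone");
        System.out.println(customEntries(metadata));
    }
}
```

With Tika itself, the same filtering could presumably be done by looping over
Metadata.names() and calling Metadata.get(name) for each name that starts with the prefix.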

Neither the Microsoft file format parsers nor the PDF parser currently include custom metadata
in their output.

Assuming we're happy with including custom metadata from documents in the parsing step, with
the custom: prefix, I'll go ahead and add it to the Microsoft (OLE2 and OOXML) and PDF parsers.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
