tika-dev mailing list archives

From Jörg Ehrlich (JIRA) <j...@apache.org>
Subject [jira] [Updated] (TIKA-929) Consistent, namespaced definitions for office file related metadata
Date Tue, 12 Jun 2012 12:44:42 GMT

     [ https://issues.apache.org/jira/browse/TIKA-929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jörg Ehrlich updated TIKA-929:
------------------------------

    Attachment: tika_OOXMLOffice_namespaces.patch

This patch should help to resolve this issue.

The patch contains the following:
* Definitions of the OOXML namespace properties in tika-core, except for those properties that already have equivalent definitions in the Office namespace interface.
* Deprecation of the old properties in the MSOffice interface.
* Adjustment of the related parsers so that they additionally map to the new OOXML properties.
* Adjustment of the related tests.
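
As a rough illustration of the approach listed above (not the code in the attached patch), the sketch below shows a deprecated, un-namespaced key next to a prefixed replacement, with a parser-style helper that writes the value under both keys so existing consumers keep working. It uses a plain Map instead of Tika's Metadata class to stay self-contained. The class name, the constant names, the key "Application-Name", and the prefix "extended-properties" are all assumptions chosen for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; none of these names are taken from the patch.
final class OfficeMetadataSketch {

    /** Old, un-namespaced key, kept only for backwards compatibility. */
    @Deprecated
    static final String LEGACY_APPLICATION_NAME = "Application-Name";

    /** New key: an assumed namespace prefix plus the local property name. */
    static final String OOXML_APPLICATION = "extended-properties:Application";

    /**
     * Mimics a parser that maps one extracted value to both the
     * deprecated key and the new namespaced key.
     */
    static Map<String, String> mapApplication(String value) {
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put(LEGACY_APPLICATION_NAME, value); // old consumers still find it
        metadata.put(OOXML_APPLICATION, value);       // new consumers use the prefixed key
        return metadata;
    }

    public static void main(String[] args) {
        // Prints the map holding both keys for the same value.
        System.out.println(mapApplication("Microsoft Office Word"));
    }
}
```

Writing both keys during a transition period means the old constants can be removed later without a flag day for callers.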
                
> Consistent, namespaced definitions for office file related metadata
> -------------------------------------------------------------------
>
>                 Key: TIKA-929
>                 URL: https://issues.apache.org/jira/browse/TIKA-929
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Nick Burch
>         Attachments: tika_OOXMLOffice_namespaces.patch
>
>
> Currently, we have the MSOffice metadata definitions, which are a mixture of Properties and Strings, none of them namespaced. Despite the name, the keys apply to a wide range of office documents (not just MS ones), and the keys are taken from a mixture of sources.
> Similar to TIKA-925 / TIKA-928, we should replace these with prefixed versions drawn from a few well-known, externally defined namespaces, then deprecate the old ones.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       
