tika-dev mailing list archives

From Joerg Ehrlich <jehrl...@adobe.com>
Subject [metadata] Input on reorganization of Metadata interfaces
Date Fri, 04 May 2012 13:43:41 GMT

I would like to start submitting patches for the following changes and would appreciate your input:

Create one "Core Properties" interface for the Metadata class that contains only the keys
for the properties that should remain directly addressable through the Metadata class in the
future. These are all of the DublinCore properties plus copyright and a few other relevant ones.
The keys will be the ones we have had before, such as "Title", "Keywords", "Format", etc.
Each key will always link to a property of another namespace interface, for example:
String Title = DublinCore.Title.getName();
String Author = DublinCore.Creator.getName();
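
To make the aliasing idea concrete, here is a minimal, self-contained sketch. It does not use the real Tika classes: Property, DublinCore and CoreProperties below are simplified stand-ins nested in a hypothetical CorePropertiesSketch class, and the key strings "dc:title" and "dc:creator" are assumptions chosen for illustration only.

```java
public class CorePropertiesSketch {

    // Simplified stand-in for a metadata property; only the name matters here.
    static final class Property {
        private final String name;

        Property(String name) {
            this.name = name;
        }

        String getName() {
            return name;
        }
    }

    // Hypothetical namespace interface; the key strings are assumptions.
    interface DublinCore {
        Property Title = new Property("dc:title");
        Property Creator = new Property("dc:creator");
    }

    // Core interface: each key is only an alias for a namespace property,
    // so there is a single definition of the underlying key string.
    interface CoreProperties {
        String Title = DublinCore.Title.getName();
        String Author = DublinCore.Creator.getName();
    }

    public static void main(String[] args) {
        // Both lines print the key defined by the DublinCore stand-in.
        System.out.println(CoreProperties.Title);
        System.out.println(CoreProperties.Author);
    }
}
```

Because the core keys are derived from the namespace properties, a value stored under CoreProperties.Author and one stored under DublinCore.Creator.getName() would end up in the same metadata entry.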

On a side note: for the DublinCore namespace, this version differs slightly from what TIKA-859
provides. Instead of introducing a new DC_Creator property, I would keep the current Creator
property in the Core interface. Once the DublinCore interface is removed from the Metadata class,
the core property can simply alias the DC one as shown above. I would provide a new patch for

The keys of all other interfaces currently included in the Metadata class will either be removed
to avoid conflicts with the Core interface, or declared @Deprecated, with replacements
offered by specific namespace interfaces.
For example:
MSOffice.Author -> removed, replaced by new CoreProperties.Author which links to DublinCore.Creator
MSOffice.Template -> kept, but declared deprecated and replaced by new OfficeOpenXMLExtended.Template
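
The two cases above could look roughly like the following sketch. Again, none of this is the real Tika code: all types are stand-ins nested in a hypothetical DeprecationSketch class, and the key strings (including "extended-properties:Template") are assumptions for illustration.

```java
public class DeprecationSketch {

    // Simplified stand-in for a metadata property; only the name matters here.
    static final class Property {
        private final String name;

        Property(String name) {
            this.name = name;
        }

        String getName() {
            return name;
        }
    }

    // Hypothetical namespace interfaces; the key strings are assumptions.
    interface DublinCore {
        Property Creator = new Property("dc:creator");
    }

    interface OfficeOpenXMLExtended {
        Property Template = new Property("extended-properties:Template");
    }

    // New core key that links to the DublinCore property.
    interface CoreProperties {
        String Author = DublinCore.Creator.getName();
    }

    interface MSOffice {
        // Case 1: the former Author key is simply gone from this interface,
        // so it can no longer clash with CoreProperties.Author.

        // Case 2: the old key is kept for existing callers but flagged.
        /** @deprecated use {@link OfficeOpenXMLExtended#Template} instead */
        @Deprecated
        String Template = "Template";
    }

    public static void main(String[] args) {
        System.out.println(CoreProperties.Author);
        System.out.println(OfficeOpenXMLExtended.Template.getName());
    }
}
```

With this layout, code that still references MSOffice.Template keeps compiling but gets a deprecation warning, while code that referenced MSOffice.Author has to switch to the core key.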

In the long term, all interfaces except the core one should be removed from the Metadata class;
otherwise we will end up with tons of naming conflicts.


Jörg Ehrlich | Computer Scientist | XMP Technology | Adobe Systems | joerg.ehrlich@adobe.com | work: +49(40)306360
