On Fri, 4 May 2012, Joerg Ehrlich wrote:
>>> The keys will always link to properties of other namespace interfaces like:
>>> String Title = DublinCore.Title.getName(); String Author =
>>> DublinCore.Creator.getName();
>
>> Won't that break existing parsers and consumers though? As Title will
>> suddenly change from being "title" to "dc:title", won't it?
>
> If they are not using the Tika constants themselves but their values
> instead, then yes.
That'll break things like Alfresco then. (We do the mapping from Tika
metadata to Alfresco metadata on the strings, rather than by Metadata
constants, so it's more flexible and easier for users to extend). I
suspect Alfresco isn't the only consumer of Tika's metadata that does the
same thing. Anything that uses tika-cli will likewise be string based, not
Metadata constant based.
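
To make the breakage concrete, here is a minimal sketch of a string-keyed
mapping of the kind described above. It is not Alfresco's actual code: the
class name, the map() helper and the "cm:" target names are assumptions
made up for illustration, and only the "title" to "dc:title" rename comes
from this thread.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical consumer-side mapper, keyed on the literal key strings
// rather than on the Metadata constants.
final class StringKeyMappingSketch {
    // Assumed mapping table; users could extend it with their own strings.
    static final Map<String, String> MAPPING = new HashMap<>();
    static {
        MAPPING.put("title", "cm:title");
        MAPPING.put("author", "cm:author");
    }

    // Copies every extracted value whose key string is in the table;
    // keys that are not in the table are silently dropped.
    static Map<String, String> map(Map<String, String> extracted) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : extracted.entrySet()) {
            String target = MAPPING.get(e.getKey());
            if (target != null) {
                out.put(target, e.getValue());
            }
        }
        return out;
    }
}
```

With a table like this, a value extracted under "title" is mapped, while
the same value arriving under "dc:title" is dropped without any error.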
> Thinking about it, I am actually not sure whether we really need to have
> the prefixes in the names anymore if the new keys are properties instead
> of strings. Then we could implement other means to identify the
> namespace for a property, by storing it in the property for example :)
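
The idea quoted above, keeping the namespace in the property rather than in
the key string, could look roughly like the sketch below. This is not Tika's
Property class: NamespacedProperty and its methods are hypothetical names
used only to show how both the plain and the prefixed form could be derived
from one object.

```java
// Hypothetical property type that carries its namespace prefix separately
// from its local name, so the key string need not embed the prefix.
final class NamespacedProperty {
    private final String prefix;    // e.g. "dc"
    private final String localName; // e.g. "title"

    NamespacedProperty(String prefix, String localName) {
        this.prefix = prefix;
        this.localName = localName;
    }

    // Un-prefixed name, matching the existing plain string keys.
    String getName() {
        return localName;
    }

    // Namespace prefix stored on the property itself.
    String getPrefix() {
        return prefix;
    }

    // Prefixed form, for callers that want the namespace made explicit.
    String getPrefixedName() {
        return prefix + ":" + localName;
    }
}
```

The trade-off discussed in the reply that follows is which of the two forms
should be the key that consumers actually see.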
I think the current ones that have a prefix are easier and cleaner to
understand than the un-prefixed ones. If we're going to be basing the keys
explicitly on a standard, I think we ought to make that explicit wherever
we can, including in the key names. It will be a faff for people to change
over, and for us to handle in the meantime, but I think if we're going to
be making a change of this scale we should take the chance to do it all
properly.
Nick