tika-dev mailing list archives

From Jukka Zitting <jukka.zitt...@gmail.com>
Subject Using standard XMP schemas for image and audio metadata
Date Sat, 07 Feb 2009 19:32:25 GMT

Following up from the Dublin Core discussion we had earlier, now with
something a bit more concrete:

The current image and audio parsers use hardcoded strings like
"width", "height", "encoding" and "samplerate" for extracted metadata.
The semantics of these metadata keys are not documented anywhere, and
little thought has been put into interoperability with external
metadata applications. To improve things, I'd like to replace these
custom metadata keys with keys defined in part 2 of the XMP
specification [1].

More specifically, I'd like to start using the following keys for
image and audio metadata:

    * "tiff:ImageWidth" instead of "width"
    * "tiff:ImageLength" instead of "height"
    * "xmpDM:audioCompressor" instead of "encoding"
    * "xmpDM:audioSampleRate" instead of "samplerate"
    * "xmpDM:audioSampleType" instead of "bits"
    * "xmpDM:audioChannelType" instead of "channels"

The semantics of these metadata keys would be as documented in the XMP
spec. Since we don't support namespacing of metadata keys (yet, see
TIKA-61), these keys would simply use the preferred "tiff" and "xmpDM"
prefixes embedded in the metadata key strings.
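
As a rough illustration of the audio side of this change, the sketch
below renames the old audio keys to the prefixed ones in a plain
string map. The class name XmpKeyMigration and the migrate() method
are hypothetical and not part of Tika; the map stands in for whatever
the parsers actually populate, and keys without a listed replacement
are passed through unchanged.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Tika.
class XmpKeyMigration {

    // Old audio key -> proposed XMP key, with the "xmpDM" prefix
    // embedded directly in the key string (no real namespacing).
    static final Map<String, String> AUDIO_RENAMES = Map.of(
            "encoding", "xmpDM:audioCompressor",
            "samplerate", "xmpDM:audioSampleRate",
            "bits", "xmpDM:audioSampleType",
            "channels", "xmpDM:audioChannelType");

    // Returns a copy of the metadata with the old audio keys renamed.
    // Values are copied as-is; entries with other keys are kept untouched.
    static Map<String, String> migrate(Map<String, String> metadata) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : metadata.entrySet()) {
            String key = AUDIO_RENAMES.getOrDefault(entry.getKey(), entry.getKey());
            result.put(key, entry.getValue());
        }
        return result;
    }
}
```

Note that this only renames keys. Making the values match the
semantics documented in the XMP spec would be a separate step in each
parser.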

[1] http://www.adobe.com/devnet/xmp/pdfs/XMPSpecificationPart2.pdf


Jukka Zitting
