tika-dev mailing list archives

From "Ken Krugler (JIRA)" <j...@apache.org>
Subject [jira] Commented: (TIKA-298) CompositeParser.getParser() should use mimetype hierarchy when falling back
Date Sun, 11 Oct 2009 15:48:31 GMT

    [ https://issues.apache.org/jira/browse/TIKA-298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12764471#action_12764471 ]

Ken Krugler commented on TIKA-298:

Jukka said on the mailing list:

Note that the MimeType.getSuperType() method already does some
of this and we have related supertype settings stored in the
tika-mimetypes.xml configuration. The type registry could also be told
about the +xml convention and related implicit supertype settings like
the ones encoded in the MediaType.isSpecializationOf() method.

(Note that we currently have both MimeType and MediaType classes for
similar purposes. This is due to an ongoing redesign of the mime type
registry. For now it's probably best to work on the MimeType class
until the redesign is more complete.)
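
As a rough illustration of the implicit supertype rules mentioned above, here is a minimal standalone Java sketch. It does not use Tika's MimeType or MediaType classes. The class name ImplicitSupertypes and the specific rules (+xml falls back to application/xml, text/* to text/plain, everything else to application/octet-stream) are assumptions made for illustration and may not match what the type registry actually does.

```java
import java.util.Locale;
import java.util.Optional;

/** Hypothetical helper, not part of Tika: derives an implicit supertype from a type name. */
public final class ImplicitSupertypes {

    private ImplicitSupertypes() {}

    /**
     * Returns the assumed implicit supertype of the given media type name,
     * or an empty Optional when the type is the assumed root.
     */
    public static Optional<String> superTypeOf(String type) {
        String t = type.trim().toLowerCase(Locale.ROOT);
        if (t.equals("application/octet-stream")) {
            return Optional.empty();                // assumed root of the hierarchy
        }
        if (t.endsWith("+xml")) {
            return Optional.of("application/xml");  // the +xml convention
        }
        if (t.startsWith("text/") && !t.equals("text/plain")) {
            return Optional.of("text/plain");       // assumed rule for text types
        }
        return Optional.of("application/octet-stream");
    }

    public static void main(String[] args) {
        // Print the whole chain from a specific type up to the assumed root.
        Optional<String> current = Optional.of("application/rss+xml");
        while (current.isPresent()) {
            System.out.println(current.get());
            current = superTypeOf(current.get());
        }
    }
}
```

Under these assumed rules the chain for application/rss+xml is application/rss+xml, application/xml, application/octet-stream. Explicit supertype settings such as those in tika-mimetypes.xml would take precedence over rules like these in a real registry.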

> CompositeParser.getParser() should use mimetype hierarchy when falling back
> ---------------------------------------------------------------------------
>                 Key: TIKA-298
>                 URL: https://issues.apache.org/jira/browse/TIKA-298
>             Project: Tika
>          Issue Type: Improvement
>    Affects Versions: 0.4
>            Reporter: Ken Krugler
> CompositeParser.getParser() doesn't use supertypes when falling back - if it can't get
> a parser for the exact mimetype, then it goes straight to the fallback parser.
> So, for example, if the file mimetype is application/<whatever>+xml, and no parser
> exists for it, then you get the default "do nothing" parser versus the XML parser.
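
The behaviour requested in the issue can be sketched roughly as follows. This is a standalone illustration, not the CompositeParser code: FallbackLookup, its String-keyed parser map, and the supertype function passed to the constructor are all hypothetical stand-ins, and parsers are represented by plain labels rather than Tika Parser instances.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical sketch, not Tika's CompositeParser: finds a parser by walking supertypes. */
public final class FallbackLookup {

    private final Map<String, String> parsers = new HashMap<>();
    private final Function<String, Optional<String>> superTypeOf;
    private final String fallback;

    public FallbackLookup(Function<String, Optional<String>> superTypeOf, String fallback) {
        this.superTypeOf = superTypeOf;
        this.fallback = fallback;
    }

    public void register(String type, String parserLabel) {
        parsers.put(type, parserLabel);
    }

    /** Tries the exact type first, then each supertype, and only then the fallback. */
    public String getParser(String type) {
        Set<String> seen = new HashSet<>();          // guards against supertype cycles
        Optional<String> current = Optional.of(type);
        while (current.isPresent() && seen.add(current.get())) {
            String parser = parsers.get(current.get());
            if (parser != null) {
                return parser;
            }
            current = superTypeOf.apply(current.get());
        }
        return fallback;
    }

    public static void main(String[] args) {
        // Assumed supertype rule for the demo: only the +xml convention.
        FallbackLookup lookup = new FallbackLookup(
                t -> t.endsWith("+xml") ? Optional.of("application/xml") : Optional.empty(),
                "do-nothing parser");
        lookup.register("application/xml", "XML parser");

        System.out.println(lookup.getParser("application/whatever+xml")); // prints "XML parser"
        System.out.println(lookup.getParser("image/png"));                // prints "do-nothing parser"
    }
}
```

Passing the supertype rule in as a function keeps the lookup independent of where the hierarchy comes from, so the same loop would work whether supertypes are read from configuration or derived from naming conventions.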

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
