tika-dev mailing list archives

From "Jukka Zitting (JIRA)" <j...@apache.org>
Subject [jira] Updated: (TIKA-317) Annotation-based Tika configuration
Date Wed, 17 Feb 2010 14:09:28 GMT

     [ https://issues.apache.org/jira/browse/TIKA-317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting updated TIKA-317:

    Attachment: TIKA-317.patch

The attached patch introduces the following new Parser method:

    /**
     * Returns the set of media types supported by this parser when used
     * with the given parse context.
     *
     * @since Apache Tika 0.7
     * @param context parse context
     * @return immutable set of media types
     */
    Set<MediaType> getSupportedTypes(ParseContext context);

An explicit method is better than static annotations because it lets parsers adapt at
runtime to situations where optional dependencies, such as certain parser libraries, are
not available. This approach also works for parser compositions and decorators, which can
compute their supported types from the parsers they wrap.
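
To illustrate the point above, here is a minimal sketch of how a parser could report its
types dynamically. It is not code from the patch: ParseContext and Parser below are
simplified stand-ins for the Tika types (media types are plain strings), and
OptionalLibraryParser and DelegatingParser are hypothetical names used only for this example.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class SupportedTypesSketch {

    /** Simplified stand-in for Tika's ParseContext (assumption, not the real class). */
    static class ParseContext { }

    /** Simplified stand-in for the Parser interface; media types are plain strings. */
    interface Parser {
        Set<String> getSupportedTypes(ParseContext context);
    }

    /** Hypothetical parser whose functionality depends on an optional library. */
    static class OptionalLibraryParser implements Parser {
        private final String requiredClass;
        private final Set<String> types;

        OptionalLibraryParser(String requiredClass, Set<String> types) {
            this.requiredClass = requiredClass;
            // Defensive copy so the returned set is effectively immutable.
            this.types = Collections.unmodifiableSet(new HashSet<>(types));
        }

        @Override
        public Set<String> getSupportedTypes(ParseContext context) {
            try {
                // Probe the classpath for the optional dependency.
                Class.forName(requiredClass);
                return types;
            } catch (ClassNotFoundException | LinkageError e) {
                // Library missing or unusable: advertise no types at all.
                return Collections.emptySet();
            }
        }
    }

    /** Hypothetical decorator: supports exactly what the wrapped parser supports. */
    static class DelegatingParser implements Parser {
        private final Parser delegate;

        DelegatingParser(Parser delegate) {
            this.delegate = delegate;
        }

        @Override
        public Set<String> getSupportedTypes(ParseContext context) {
            return delegate.getSupportedTypes(context);
        }
    }

    public static void main(String[] args) {
        ParseContext context = new ParseContext();
        Set<String> pdf = Collections.singleton("application/pdf");

        // java.lang.String is always loadable; the second class name is made up.
        Parser available = new OptionalLibraryParser("java.lang.String", pdf);
        Parser missing = new DelegatingParser(
                new OptionalLibraryParser("example.missing.Library", pdf));

        System.out.println("available: " + available.getSupportedTypes(context));
        System.out.println("missing:   " + missing.getSupportedTypes(context));
    }
}
```

A static annotation would have to list the media types unconditionally, whereas the method
can return an empty set when the dependency cannot be loaded.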

The patch modifies the configuration mechanism so that the getSupportedTypes() method is used
whenever a <parser/> entry without embedded <mime/> elements is encountered. This
should maintain reasonable backwards compatibility with existing config files until Tika 1.0.
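
The fallback described above could look roughly like the sketch below. This is not the
code from the patch: ParseContext and Parser are simplified stand-ins (media types are
plain strings), and the register() helper and its registry map are hypothetical names
chosen for the example. The list argument represents the <mime/> values found inside one
<parser/> entry.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ConfigFallbackSketch {

    /** Simplified stand-in for Tika's ParseContext (assumption, not the real class). */
    static class ParseContext { }

    /** Simplified stand-in for the Parser interface; media types are plain strings. */
    interface Parser {
        Set<String> getSupportedTypes(ParseContext context);
    }

    /**
     * Registers a parser under the media types listed in its config entry,
     * or, when the entry lists none, under the types the parser reports itself.
     */
    static void register(Map<String, Parser> registry,
                         Parser parser,
                         List<String> configuredMimes) {
        Collection<String> types;
        if (configuredMimes.isEmpty()) {
            // No embedded <mime/> elements: ask the parser.
            types = parser.getSupportedTypes(new ParseContext());
        } else {
            // Explicit <mime/> elements win, as in existing config files.
            types = configuredMimes;
        }
        for (String type : types) {
            registry.put(type, parser);
        }
    }

    public static void main(String[] args) {
        Map<String, Parser> registry = new HashMap<>();
        Parser textParser = context -> Collections.singleton("text/plain");

        // Entry without <mime/> elements: falls back to getSupportedTypes().
        register(registry, textParser, Collections.<String>emptyList());

        // Entry with an explicit <mime/> element: the configured type is used.
        register(registry, textParser, Arrays.asList("text/html"));

        System.out.println(registry.keySet());
    }
}
```

Letting explicit <mime/> elements take precedence is what keeps older config files
working unchanged in this sketch.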

> Annotation-based Tika configuration
> -----------------------------------
>                 Key: TIKA-317
>                 URL: https://issues.apache.org/jira/browse/TIKA-317
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>            Reporter: Jukka Zitting
>            Assignee: Jukka Zitting
>            Priority: Minor
>         Attachments: TIKA-317.patch
>
> I'd like to simplify Tika configuration and make it easier to customize by pushing the
> information in tika-config.xml to Parser annotations and Java SPI service files.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
