tika-dev mailing list archives

From "Jukka Zitting" <jukka.zitt...@gmail.com>
Subject Re: Customzing TikaConfig or rather getParser
Date Mon, 25 Aug 2008 08:32:58 GMT
Hi,

On Mon, Aug 25, 2008 at 9:06 AM, Michael Wechner
<michael.wechner@wyona.com> wrote:
> I think this is where the problem is, I mean the getParser(String) method.
>
> I would like to overwrite this method by implementing my own chain of
> responsibility.

How about the following:

    public class MyCustomParser extends CompositeParser {

        public MyCustomParser() throws TikaException {
            setConfig(TikaConfig.getDefaultConfig());
            // or whatever config you want
        }

        protected Parser getParser(Metadata metadata) {
            // Custom code to select an appropriate parser
            // based on the input metadata (mime type,
            // document path, whatever) passed by the client.
            // Or fallback to:
            return super.getParser(metadata);
        }

    }

Your client code would then look like:

    private Parser parser = new MyCustomParser();

    Metadata metadata = new Metadata();
    // contentType is a placeholder for the MIME type string known to the client
    metadata.set(Metadata.CONTENT_TYPE, contentType);
    // plus whatever other metadata you need in MyCustomParser

    parser.parse(stream, handler, metadata);

One of my design goals for the current Parser interface was that
you can encapsulate this sort of functionality inside it.
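
As a rough illustration of the "custom code to select an appropriate
parser" step, here is a minimal chain-of-responsibility sketch. It
deliberately avoids Tika types so that it runs on its own: the metadata
is a plain Map, parsers are represented by name only, and
ParserChainSketch, selectParser, the "path" key and the parser labels
are all hypothetical names chosen for the example. The final fallback
plays the role of super.getParser(metadata).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

public class ParserChainSketch {

    // Each link inspects the metadata and either names a parser or declines.
    // Parser names are stand-ins for real Parser instances (assumption).
    static final List<Function<Map<String, String>, Optional<String>>> CHAIN =
        Arrays.asList(
            // Link 1: use an explicit content type if the client supplied one.
            md -> "application/pdf".equals(md.get("Content-Type"))
                    ? Optional.of("pdf-parser")
                    : Optional.<String>empty(),
            // Link 2: otherwise look at the (hypothetical) document path.
            md -> md.getOrDefault("path", "").endsWith(".xml")
                    ? Optional.of("xml-parser")
                    : Optional.<String>empty()
        );

    // Walk the chain in order; the first link that answers wins.
    static String selectParser(Map<String, String> metadata) {
        for (Function<Map<String, String>, Optional<String>> link : CHAIN) {
            Optional<String> choice = link.apply(metadata);
            if (choice.isPresent()) {
                return choice.get();
            }
        }
        // No link matched: fall back to the default lookup,
        // analogous to super.getParser(metadata).
        return "default-parser";
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new HashMap<>();
        metadata.put("path", "docs/report.xml");
        System.out.println(selectParser(metadata));
    }
}
```

In a real subclass the links would return Parser instances and the loop
would live inside the overridden getParser(Metadata) method.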

BR,

Jukka Zitting
