tika-dev mailing list archives

From "Jukka Zitting" <jukka.zitt...@gmail.com>
Subject Re: Providing a Default Tika Configuration
Date Tue, 25 Sep 2007 15:55:22 GMT

On 9/25/07, kbennett <kbennett@bbsinc.biz> wrote:
> This means that every time a parse method that uses a default configuration
> is used, the default configuration's XML will be reparsed.  This may not be
> a big deal for apps that only occasionally do this, but for an app whose
> mission is to parse documents, it seems kind of wasteful, especially when it
> can be remedied with a small number of simple lines of code.  Certainly I
> can get the default configuration once, hold onto it, and then call the
> parse methods that take it, but it seems odd to me that I would have to do
> that.  I realize it's a minor issue, though.

I would argue that reusing the configuration instance is the
preferred mode of operation. Currently I wouldn't do that due to the
mutability of Content instances, but as we get to the point of having
stateless Parser instances, I'd even advocate instantiating the full
set of configured parsers when your application starts and reusing
this configuration for any number of documents.
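
To make the reuse pattern concrete, here is a rough sketch of loading a
configuration once and passing it to every parse call. It is an
illustration only, not Tika's actual API: SketchConfig,
ConfigReuseSketch, the XML layout and the parse() helper are all
hypothetical names, and the sketch assumes the configuration is not
modified after it has been loaded.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical stand-in for a parser configuration; not Tika's real class.
final class SketchConfig {
    // Counts how often the XML is parsed, to make the cost visible.
    static int loadCount = 0;

    private final List<String> parserNames = new ArrayList<>();

    private SketchConfig() {}

    // Parses the configuration XML; this is the work worth doing only once.
    static SketchConfig load(String xml) throws Exception {
        loadCount++;
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList nodes = doc.getElementsByTagName("parser");
        SketchConfig config = new SketchConfig();
        for (int i = 0; i < nodes.getLength(); i++) {
            config.parserNames.add(((Element) nodes.item(i)).getAttribute("name"));
        }
        return config;
    }

    int parserCount() {
        return parserNames.size();
    }
}

final class ConfigReuseSketch {
    // Made-up configuration format, used only for this illustration.
    static final String XML =
            "<config><parser name=\"text\"/><parser name=\"html\"/></config>";

    // Stand-in for a parse method that takes an existing configuration.
    static String parse(String document, SketchConfig config) {
        return document.trim() + " [" + config.parserCount() + " parsers configured]";
    }

    public static void main(String[] args) throws Exception {
        // Load once at application startup, then reuse for every document.
        SketchConfig config = SketchConfig.load(XML);
        for (String doc : new String[] {"first ", "second ", "third "}) {
            System.out.println(parse(doc, config));
        }
        System.out.println("XML parsed " + SketchConfig.loadCount + " time(s)");
    }
}
```

The XML is read a single time no matter how many documents go through
parse(). Sharing one instance like this is only safe when nothing
mutates it between calls, which is the caveat about mutable state
mentioned above.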


Jukka Zitting
