tika-dev mailing list archives

From "Bob Paulin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-1762) Create Executor Service from TikaConfig
Date Thu, 15 Oct 2015 13:32:05 GMT

    [ https://issues.apache.org/jira/browse/TIKA-1762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14958898#comment-14958898 ]

Bob Paulin commented on TIKA-1762:

I agree that for a ThreadPool to be useful it should be shared across parsers and documents.
I think the first 3 bullet points are straightforward and non-controversial.  I think the
4th point, setting a small ThreadExecutor as a default inside each parser, presents some
problems outside of TikaConfig.  If we instantiated a small thread pool in the default constructor
of the parsers that use threads, there is the potential to create many thread pools, which would
be a drag on system resources.  In addition to developers instantiating parsers themselves,
there is also the ServiceLoader to consider.  The static service providers are coded to always
use the default constructor, which could create a large number of unshared thread pools.  We
could use reflection to select the ExecutorService constructor in the service loader to address
this.  I think we would be setting a bad precedent if we started giving each parser its own
thread pool.  I think it's preferable to have a solution where the default thread pool is
shared.  I'd be interested in how often developers are instantiating the parsers themselves
vs. using TikaConfig vs. using the ServiceLoader/DefaultParser.
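
To make the concern concrete, here is a minimal sketch of a shared default, assuming a made-up ThreadedParser class (it is not a Tika class, and the pool size of 2 is arbitrary). The no-arg constructor, which is the one the ServiceLoader always calls, hands out a single lazily created pool instead of building a new one per instance; callers who want their own pool use the second constructor.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for a parser that needs threads; not a Tika class.
class ThreadedParser {

    // Holder idiom: the pool is created once, on first use, and shared by
    // every instance built through the no-arg constructor.
    private static final class SharedDefault {
        static final ExecutorService POOL = Executors.newFixedThreadPool(2, r -> {
            Thread t = new Thread(r, "shared-parser-pool");
            t.setDaemon(true); // do not keep the JVM alive for idle workers
            return t;
        });
    }

    private final ExecutorService executor;

    // The constructor a service loader would call: no new pool is created here.
    ThreadedParser() {
        this(SharedDefault.POOL);
    }

    // Explicit injection for callers that manage their own pool.
    ThreadedParser(ExecutorService executor) {
        this.executor = executor;
    }

    ExecutorService executor() {
        return executor;
    }
}
```

With this shape, a hundred default-constructed parsers still share one pool, so the cost of the default does not grow with the number of parser instances.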

> Create Executor Service from TikaConfig
> ---------------------------------------
>                 Key: TIKA-1762
>                 URL: https://issues.apache.org/jira/browse/TIKA-1762
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Bob Paulin
>             Fix For: 1.11
> Create an executor service that is configurable from the TikaConfig.
>  Konstantin Gribov added a comment - 23/Sep/15 09:55
> Bob Paulin, I have two ideas on the issue:
>     by default use a common thread pool, configured via and contained in TikaConfig, as
> Tyler Palsulich suggested,
>     you can pass a thread pool for parser invocation via ParseContext, with fallback to the
> default if there is no thread pool/executor service in the context.
> Also o.a.tika.Tika#parse(InputStream, Metadata) produces an o.a.tika.parser.ParsingReader
> and an anonymous Executor with unbounded daemon thread creation.
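
The two ideas in the quoted comment (a common pool owned by the configuration, plus a per-invocation override with fallback) could look roughly like the sketch below. SketchContext and SketchConfig are hypothetical stand-ins written for this example, not Tika's actual classes, and resolveExecutor is an assumed method name.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;

// Hypothetical per-invocation context keyed by type; not Tika's class.
class SketchContext {
    private final Map<Class<?>, Object> entries = new HashMap<>();

    <T> void set(Class<T> key, T value) {
        entries.put(key, value);
    }

    // Returns the stored value for the key, or defaultValue if none was set.
    <T> T get(Class<T> key, T defaultValue) {
        Object value = entries.get(key);
        return value != null ? key.cast(value) : defaultValue;
    }
}

// Hypothetical configuration object that owns the common pool.
class SketchConfig {
    private final ExecutorService defaultExecutor;

    SketchConfig(ExecutorService defaultExecutor) {
        this.defaultExecutor = defaultExecutor;
    }

    // An executor placed in the context wins; otherwise fall back to the
    // pool held by the configuration.
    ExecutorService resolveExecutor(SketchContext context) {
        return context.get(ExecutorService.class, defaultExecutor);
    }
}
```

A parser would call resolveExecutor once per invocation, so it never needs to create or own a pool itself.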
