tika-dev mailing list archives

From "Tim Allison (Jira)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-3009) XML Parser reset() detection no working in weblogic
Date Thu, 12 Dec 2019 13:48:00 GMT

    [ https://issues.apache.org/jira/browse/TIKA-3009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16994687#comment-16994687 ]

Tim Allison commented on TIKA-3009:

[~nick], I was going to go with "exciting".  To confirm: you can't control which SAXParser
is used via, e.g., -Djavax.xml.parsers.SAXParserFactory?
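
For anyone who wants to check this on their own server, here is a minimal sketch (not Tika code) that prints the value of the standard JAXP system property next to the factory and parser classes that actually get resolved. The class name WhichSaxParser and the describe() helper are made up for illustration. Whether the container honors the property is exactly the open question.

```java
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

// Hypothetical diagnostic class; not part of Tika or WebLogic.
class WhichSaxParser {

    // Standard JAXP system property consulted by SAXParserFactory.newInstance().
    static final String FACTORY_PROPERTY = "javax.xml.parsers.SAXParserFactory";

    // Reports the property value and the implementation classes actually in use.
    static String describe() throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        SAXParser parser = factory.newSAXParser();
        return "property=" + System.getProperty(FACTORY_PROPERTY)
                + ", factory=" + factory.getClass().getName()
                + ", parser=" + parser.getClass().getName();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(describe());
    }
}
```

Running this inside the deployed application (rather than from a plain command line) would show whether the parser class changes when the property is set.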

> XML Parser reset() detection no working in weblogic
> ------------------------------------------------------------
>                 Key: TIKA-3009
>                 URL: https://issues.apache.org/jira/browse/TIKA-3009
>             Project: Tika
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.20, 1.21, 1.22, 1.23
>         Environment: JDK 1.8.0_231
> Oracle Weblogic Server
>            Reporter: Daniel
>            Priority: Critical
> Starting with Tika 1.20, org.apache.tika.utils.XMLReaderUtils tries to detect whether an XML
> parser supports reset() by calling reset() during poolParser creation
> and watching for an UnsupportedOperationException.
> Unfortunately, this does not work in WebLogic Server, because the RegistryParser that is obtained
> itself caches the underlying SAX parsers. The reset() of the underlying SAXParser is called,
> and throws the UnsupportedOperationException, only after the parser's first use. A first call
> to reset() does not throw this exception, so XMLReaderUtils concludes that the parser supports
> reset(), which is not the case.
> This results in exhaustion of the parser pool, with intermittent errors and delays in processing,
> because the pool is reset when a parser is not available after 5 minutes.
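
The failure mode described above can be reproduced in isolation. The following is only a sketch under stated assumptions: NoResetParser and LazyWrapper are hypothetical stand-ins, not WebLogic's RegistryParser, and supportsReset() is a simplified probe, not the actual XMLReaderUtils code. The one fact it relies on is that the base javax.xml.parsers.SAXParser.reset() throws UnsupportedOperationException unless a subclass overrides it.

```java
import javax.xml.parsers.SAXParser;
import org.xml.sax.Parser;
import org.xml.sax.XMLReader;

// Hypothetical demo class; names are illustrative only.
class ResetProbeSketch {

    // Stand-in for an underlying parser that does not override reset(),
    // so it inherits SAXParser.reset(), which throws UnsupportedOperationException.
    static class NoResetParser extends SAXParser {
        @Override public Parser getParser() { return null; }
        @Override public XMLReader getXMLReader() { return null; }
        @Override public boolean isNamespaceAware() { return true; }
        @Override public boolean isValidating() { return false; }
        @Override public void setProperty(String name, Object value) { }
        @Override public Object getProperty(String name) { return null; }
    }

    // Assumed behavior of a caching wrapper: the delegate is only created on
    // first use, and reset() is forwarded only once a delegate exists.
    static class LazyWrapper extends NoResetParser {
        private SAXParser delegate;

        // Simulates the first parse, which acquires the underlying parser.
        void use() {
            if (delegate == null) {
                delegate = new NoResetParser();
            }
        }

        @Override public void reset() {
            if (delegate != null) {
                delegate.reset();
            }
        }
    }

    // Simplified probe: call reset() and watch for UnsupportedOperationException.
    static boolean supportsReset(SAXParser parser) {
        try {
            parser.reset();
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        LazyWrapper wrapper = new LazyWrapper();
        System.out.println("probe at creation: " + supportsReset(wrapper));     // prints true
        wrapper.use();
        System.out.println("probe after first use: " + supportsReset(wrapper)); // prints false
    }
}
```

Under these assumptions, a probe made at creation time reports that reset() is supported, while the same probe after the first use reports that it is not, which matches the behavior described in the report.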

