uima-dev mailing list archives

From "Timo Boehme (JIRA)" <...@uima.apache.org>
Subject [jira] [Commented] (UIMA-6064) External DTD usage in XML descriptors disabled during build revision upgrade
Date Thu, 13 Jun 2019 08:21:00 GMT

    [ https://issues.apache.org/jira/browse/UIMA-6064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16862818#comment-16862818 ]

Timo Boehme commented on UIMA-6064:

Hi, I've checked it and it is actually the other way around: DISALLOW_DOCTYPE_DECL is the
setting that decides whether any DTD declaration (internal or external) is allowed at all.
While it is enabled, any <!DOCTYPE ...> produces an error. LOAD_EXTERNAL_DTD
(http://apache.org/xml/features/nonvalidating/load-external-dtd) is only recognized when not
validating (if 'true', the parser will parse (but not 'use') the DTD and report the errors it
contains). If validation is set to 'true' (e.g. 'mSchemaValidationEnabled' is 'true'), the
feature 'http://xml.org/sax/features/validation' is set to 'true', which overwrites

To summarize: a switch/flag for DISALLOW_DOCTYPE_DECL is required.
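
Such a switch could look roughly like the sketch below. This is not the UIMA code: the class
name, the helper methods and the system property 'example.xml.allowDoctype' are hypothetical
and only illustrate the idea. The two feature URIs are the standard Xerces ones, and the
sketch assumes the JDK's built-in parser, which recognizes both.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch, not the UIMA implementation.
final class DoctypeSwitchSketch {
  static final String DISALLOW_DOCTYPE_DECL =
      "http://apache.org/xml/features/disallow-doctype-decl";
  static final String LOAD_EXTERNAL_DTD =
      "http://apache.org/xml/features/nonvalidating/load-external-dtd";

  // Assumed property name, invented for this example only.
  static final String ALLOW_DTD_PROPERTY = "example.xml.allowDoctype";

  // Small document with an internal DTD subset that defines one entity.
  static final String SAMPLE =
      "<!DOCTYPE root [<!ENTITY chunk \"hello\">]><root>&chunk;</root>";

  // Secure by default: DOCTYPE is rejected unless the caller opts in.
  static SAXParserFactory createFactory(boolean allowDoctype) throws Exception {
    SAXParserFactory factory = SAXParserFactory.newInstance();
    factory.setFeature(DISALLOW_DOCTYPE_DECL, !allowDoctype);
    factory.setFeature(LOAD_EXTERNAL_DTD, allowDoctype);
    return factory;
  }

  // Returns true if the document parses, false if the parser rejects it.
  static boolean parses(String xml, boolean allowDoctype) {
    try {
      createFactory(allowDoctype).newSAXParser().parse(
          new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
          new DefaultHandler());
      return true;
    } catch (SAXParseException e) {
      return false; // e.g. DOCTYPE seen while DISALLOW_DOCTYPE_DECL is true
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    // Boolean.getBoolean reads the system property; absent means false.
    boolean allow = Boolean.getBoolean(ALLOW_DTD_PROPERTY);
    System.out.println("DOCTYPE allowed: " + allow
        + ", sample parses: " + parses(SAMPLE, allow));
  }
}
```

Running it with -Dexample.xml.allowDoctype=true would opt in to DTDs for trusted descriptors,
while the restrictive setting stays the default for everyone else.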

> External DTD usage in XML descriptors disabled during build revision upgrade
> ----------------------------------------------------------------------------
>                 Key: UIMA-6064
>                 URL: https://issues.apache.org/jira/browse/UIMA-6064
>             Project: UIMA
>          Issue Type: Bug
>          Components: Core Java Framework
>    Affects Versions: 2.10.2SDK
>            Reporter: Timo Boehme
>            Priority: Major
> Between version 2.10.1 and 2.10.2 the XMLParser configuration was changed (hard-coded, without
> the possibility to adjust it) to no longer allow DTDs or loading them from an external file.
> This is done in XMLUtils.createSAXParserFactory(), which sets the DISALLOW_DOCTYPE_DECL
> and LOAD_EXTERNAL_DTD features. Before, the SAXParserFactory was created without adjusting these
> While I understand that this was done to prevent malicious XML from doing nasty things,
> the way it was done is problematic:
>  * the change happened in a revision build, with no major or minor version number change
>  * it was not documented
>  * one cannot simply change it back, e.g. via an environment variable or a method call;
>    the only workaround is a problematic sub-classing of XMLParser_impl with additional
>    configuration etc.
> We use DTDs for CPE descriptors quite a lot to keep the descriptors in modular chunks
> using entities etc. Thus it is important (for the time being) to use DTDs there, and we know
> that the XML is not problematic.
> Because this feature (DTD) is crucial, I have marked this as a BUG, since such changes
> should not occur in a build upgrade, or it should at least be possible to get the old behavior
> back easily.

This message was sent by Atlassian JIRA
