tika-dev mailing list archives

From "John Gibson (JIRA)" <j...@apache.org>
Subject [jira] [Created] (TIKA-1000) secure-processing not supported by some JAXP implementations and causes mime type detection to fail
Date Tue, 02 Oct 2012 23:43:07 GMT
John Gibson created TIKA-1000:
---------------------------------

             Summary: secure-processing not supported by some JAXP implementations and causes mime type detection to fail
                 Key: TIKA-1000
                 URL: https://issues.apache.org/jira/browse/TIKA-1000
             Project: Tika
          Issue Type: Bug
          Components: parser
    Affects Versions: 1.2
         Environment: Android 2.3.6
            Reporter: John Gibson


The XmlRootExtractor class tries to set the secure-processing feature, which JAXP requires all
parser implementations to support. Unfortunately, the parser shipped with Android (and presumably
some other parsers) does not support the feature. Setting it causes the following exception:
"org.xml.sax.SAXNotRecognizedException: Feature 'http://javax.xml.XMLConstants/feature/secure-processing'
is not recognized."

However, this exception is swallowed by XmlRootExtractor, which then returns null. When
org.apache.tika.mime.MimeTypes sees that no root element was found, it assumes that the file
is not valid XML and downgrades the result to text/plain.
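
For illustration, here is a minimal sketch of one way to tolerate a parser that rejects the
feature. It is not the actual Tika code or the TIKA-271 patch. The class name
LenientSecureProcessing and both method names are hypothetical. The idea is to catch the
failure from setFeature separately, so that a missing secure-processing feature is not
mistaken for "no root element found".

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import javax.xml.XMLConstants;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of Tika.
final class LenientSecureProcessing {

    // Tries to enable secure-processing; returns false instead of failing
    // when the JAXP implementation does not recognize or support the feature.
    static boolean trySetSecureProcessing(SAXParserFactory factory) {
        try {
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            return true;
        } catch (ParserConfigurationException
                | SAXNotRecognizedException
                | SAXNotSupportedException e) {
            return false; // carry on without the feature
        }
    }

    // Returns the local name of the root element, or null if the input
    // cannot be parsed far enough to reach one.
    static String rootElementName(byte[] xml) {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        trySetSecureProcessing(factory); // failure here is not fatal

        final String[] root = new String[1];
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs)
                    throws SAXException {
                root[0] = localName;
                // Abort parsing once the root element has been seen.
                throw new SAXException("root element found, stopping");
            }
        };
        try {
            factory.newSAXParser().parse(new ByteArrayInputStream(xml), handler);
        } catch (SAXException | IOException | ParserConfigurationException e) {
            // Either the deliberate stop above or a real parse failure;
            // root[0] stays null in the latter case.
        }
        return root[0];
    }
}
```

With this structure, only a genuine parse failure yields null, so a caller that falls back
to text/plain on null would no longer do so merely because the feature is unavailable.
Whether it is acceptable to parse without secure-processing is a separate decision for the
project.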

This was fixed long ago by TIKA-271, but as Michael Pisula points out, commit 1004050 broke
it again.  I'd simply reopen that issue, but I don't have permission to do that.
