hadoop-common-issues mailing list archives

From "Ferdy (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HADOOP-7614) Reloading configuration when using inputstream resources results in org.xml.sax.SAXParseException
Date Wed, 07 Sep 2011 13:22:10 GMT
Reloading configuration when using inputstream resources results in org.xml.sax.SAXParseException

                 Key: HADOOP-7614
                 URL: https://issues.apache.org/jira/browse/HADOOP-7614
             Project: Hadoop Common
          Issue Type: Bug
          Components: conf
    Affects Versions: 0.21.0
            Reporter: Ferdy
            Priority: Minor

When an InputStream is used as a configuration resource, reloading the configuration throws
the following exception:

Exception in thread "main" java.lang.RuntimeException: org.xml.sax.SAXParseException: Premature end of file.
	at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1576)
	at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1445)
	at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1381)
	at org.apache.hadoop.conf.Configuration.get(Configuration.java:569)
Caused by: org.xml.sax.SAXParseException: Premature end of file.
	at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:249)
	at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:284)
	at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:124)
	at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1504)
	... 4 more

To reproduce, see the following test code:
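    Configuration conf = new Configuration();
    ByteArrayInputStream bais = new ByteArrayInputStream("<configuration></configuration>".getBytes());
    conf.addResource(bais); // register the stream as a resource
    conf.get("blah"); // first load reads the stream to its end
    conf.addResource("core-site.xml"); // just add a named resource, doesn't matter which one
    conf.get("blah"); // forces a reload, which re-reads the exhausted stream and throws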

Allowing inputstream resources is flexible, but in cases such as this it can lead to
difficult-to-debug problems.
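The failure can be illustrated without Hadoop at all. The sketch below uses only the JDK XML parser and assumes, as the stack trace suggests, that the reload simply parses the same stream object a second time. The class and method names (StreamReparseDemo, parses) are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Hypothetical demo class, not part of Hadoop.
final class StreamReparseDemo {

    /** Returns true if the stream parses as XML, false if the parser rejects it. */
    static boolean parses(InputStream in) {
        try {
            DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
            return true;
        } catch (SAXException e) {
            // An exhausted stream yields zero bytes, which is not a valid document.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        InputStream in = new ByteArrayInputStream(
                "<configuration></configuration>".getBytes(StandardCharsets.UTF_8));
        System.out.println("first parse ok:  " + parses(in)); // consumes the stream
        System.out.println("second parse ok: " + parses(in)); // nothing left to read
    }
}
```

The second call fails because the first one has already read the stream to its end, which is the same situation a configuration reload runs into.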

What do you think is the best solution? We could:
A) reset the inputstream after it is read instead of closing it (but what to do when the stream
does not support marking?)
B) leave it up to the client (for example make sure you implement close() so that it resets
the stream)
C) when reading the inputstream for the first time, cache or wrap the contents somehow so
that it can be read multiple times (let's at least document it)
D) remove inputstream method altogether
E) something else?
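To make option C concrete, here is a minimal sketch of caching the stream contents on first read. CachedStreamResource and its methods are hypothetical names, not existing Hadoop classes, and the sketch assumes the resource is small enough to hold in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of Hadoop: buffers a stream once so it can be re-read.
final class CachedStreamResource {
    private final byte[] contents;

    /** Drains and closes the given stream, keeping its bytes in memory. */
    CachedStreamResource(InputStream in) {
        this.contents = readAll(in);
    }

    /** Returns a fresh stream over the cached bytes; can be called any number of times. */
    InputStream open() {
        return new ByteArrayInputStream(contents);
    }

    /** Reads the stream to its end, closes it, and returns everything that was read. */
    static byte[] readAll(InputStream in) {
        try (InputStream source = in) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = source.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With something like this, each reload would call open() and parse a new stream, so it would not matter whether the original stream supports marking. The cost is that the whole resource stays in memory for the lifetime of the configuration.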

For now I have attached a patch for solution A.
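For readers without the attachment, the general idea behind solution A can be sketched as follows. This is not the attached patch; ResettableRead and readThenReset are hypothetical names, and the sketch only illustrates mark/reset on a plain java.io.InputStream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper, not taken from the attached patch.
final class ResettableRead {

    /**
     * Reads the stream to its end and returns the bytes. If the stream supports
     * mark/reset, it is rewound afterwards so a later call sees the same data.
     */
    static byte[] readThenReset(InputStream in) {
        try {
            boolean rewindable = in.markSupported();
            if (rewindable) {
                in.mark(Integer.MAX_VALUE); // remember the current position
            }
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            if (rewindable) {
                in.reset(); // rewind instead of closing
            }
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A ByteArrayInputStream supports marking, so calling readThenReset on it twice returns the same bytes both times. A stream whose markSupported() returns false, such as a PushbackInputStream, returns an empty array on the second call, which is the open question raised under option A.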

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

