uima-dev mailing list archives

From Marshall Schor <...@schor.com>
Subject Re: Is buffering needed when setting org.xml.sax.InputSource.InputSource?
Date Mon, 12 Feb 2007 19:15:52 GMT
Adam Lally wrote:
> I doubt it.  Is there something that led you to believe this would be
> necessary?  

Just doing some code inspection and seeing this: it is perfectly feasible to
pass a buffered version of the input to this, and the general contract for IO
seems to imply that you should use buffering for performance reasons.
But I see from some web surfing that the Xerces impl does some buffering itself,
and you can set the buffer size via a property (do we do that?  The default
is 2k I think, and the Apache license is about 1K by itself :-) ).

I guess some simple test would tell...
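
A rough sketch of what such a test could look like, assuming the input comes from a file and the JAXP default parser is used. The class name BufferingTest, the helpers makeXml and timeParse, and the document size are all made up for illustration; this is not UIMA code, and a single timing like this is only a hint, not a proper benchmark.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class BufferingTest {

    // Hypothetical helper: builds a synthetic XML document with n child elements.
    static byte[] makeXml(int n) {
        StringBuilder sb = new StringBuilder("<root>");
        for (int i = 0; i < n; i++) {
            sb.append("<item id=\"").append(i).append("\">text</item>");
        }
        sb.append("</root>");
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Parses the stream with a no-op handler and returns elapsed nanoseconds.
    static long timeParse(InputStream in) throws Exception {
        long start = System.nanoTime();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(in), new DefaultHandler());
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("buffering-test", ".xml");
        try {
            Files.write(file, makeXml(200_000));

            // Warm up so JIT compilation does not dominate the measurements.
            for (int i = 0; i < 3; i++) {
                try (InputStream in = new FileInputStream(file.toFile())) {
                    timeParse(in);
                }
            }

            long raw;
            try (InputStream in = new FileInputStream(file.toFile())) {
                raw = timeParse(in);
            }

            long buffered;
            try (InputStream in =
                    new BufferedInputStream(new FileInputStream(file.toFile()))) {
                buffered = timeParse(in);
            }

            System.out.printf("unbuffered: %d us, buffered: %d us%n",
                    raw / 1000, buffered / 1000);
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

If the parser already reads in reasonably sized chunks, the two numbers should come out close, which would suggest the extra BufferedInputStream is not buying much.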

Some web surfing turned up:

Parsers like Apache Xerces have the ability to set the input buffer size:

// Set the chunk to read in by SAX
      new Integer(2048));

See also http://xerces.apache.org/xerces2-j/properties.html
which gives some advice on how large to set this.
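
For reference, a minimal self-contained sketch of setting that property through the standard SAX API. The property name is the one documented on the Xerces properties page above; the class name InputBufferSize, the helper trySetBufferSize, and the 8192 value are just for illustration. Whether a given parser accepts the property depends on the implementation, so the sketch treats rejection as a normal outcome rather than an error.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class InputBufferSize {

    // Xerces-specific property; other SAX parsers may not recognize it.
    static final String BUFFER_SIZE_PROPERTY =
            "http://apache.org/xml/properties/input-buffer-size";

    // Hypothetical helper: returns true if the parser accepted the property.
    static boolean trySetBufferSize(XMLReader reader, int bytes) {
        try {
            // Xerces documents the value type as java.lang.Integer.
            reader.setProperty(BUFFER_SIZE_PROPERTY, Integer.valueOf(bytes));
            return true;
        } catch (SAXNotRecognizedException | SAXNotSupportedException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        XMLReader reader =
                SAXParserFactory.newInstance().newSAXParser().getXMLReader();

        boolean accepted = trySetBufferSize(reader, 8192);
        System.out.println("input-buffer-size accepted: " + accepted);

        // Parsing works the same way whether or not the property was accepted.
        reader.setContentHandler(new DefaultHandler());
        reader.parse(new InputSource(new StringReader("<root/>")));
    }
}
```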


