commons-dev mailing list archives

From Jeremias Maerki
Subject Re: RereadableInputStream
Date Mon, 15 Oct 2007 12:11:26 GMT

On 14.10.2007 04:35:27 Niall Pemberton wrote:
> On 10/11/07, Keith R. Bennett wrote:
> >
> > Hello, all.  I am working with the Apache Tika project.  We found the need to
> > get a newly opened input stream from the user, and possibly read it multiple
> > times.  I am aware of the mark and reset methods, but we needed to support
> > streams of arbitrary length, so I thought we'd have to figure something else
> > out.
> I don't see anything in the javadocs for the mark/reset methods in
> InputStream that prevent it from being used for streams of arbitrary
> length. Is this an assumption based on the fact that the mark method
> specifies a "readLimit" parameter? In the reset method javadocs it
> only says an IOException "Might" be thrown if the readLimit has been
> exceeded - so my take is that it would not be inconsistent to create
> an implementation that ignores that parameter. Better IMO to use these
> than invent a new "rewind" method.

FWIW, I'm in a similar position in Apache FOP. So far we're using
mark/reset, but it does have its limits (aka the readLimit parameter).
We're processing images of arbitrary size and in the case of PNG, for
example, some chunks we need early on may actually be located after the
actual image data. In that case, mark/reset is problematic. I'd love a
BufferedInputStream alternative that can easily deal with mark(Integer.MAX_VALUE).
I'm glad to see that others have the same problem.
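
To make that concrete, here is a rough sketch of what such an alternative
could look like. It follows Niall's suggestion of simply ignoring the
readLimit argument: everything read after mark() is recorded so reset() can
always replay it. The class name UnlimitedMarkInputStream is made up for
illustration, it is not an existing Commons IO class, and the sketch keeps
the recorded bytes in memory, so a real version would have to spill to disk
for large streams.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical name; a sketch only. Not thread-safe.
class UnlimitedMarkInputStream extends FilterInputStream {
    private byte[] buf = new byte[8192]; // bytes recorded since the last mark()
    private int count = 0;               // number of valid bytes in buf
    private int pos = 0;                 // next byte of buf to replay
    private boolean marked = false;

    UnlimitedMarkInputStream(InputStream in) {
        super(in);
    }

    @Override
    public boolean markSupported() {
        return true;
    }

    @Override
    public void mark(int readLimit) {
        // readLimit is deliberately ignored. Keep only the bytes that have
        // not been replayed yet; they now follow the new mark position.
        System.arraycopy(buf, pos, buf, 0, count - pos);
        count -= pos;
        pos = 0;
        marked = true;
    }

    @Override
    public void reset() throws IOException {
        if (!marked) {
            throw new IOException("mark() has not been called");
        }
        pos = 0; // replay everything recorded since mark()
    }

    @Override
    public int read() throws IOException {
        if (pos < count) {
            return buf[pos++] & 0xff;
        }
        int b = in.read();
        if (b >= 0 && marked) {
            record(new byte[] {(byte) b}, 0, 1);
        }
        return b;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        if (pos < count) { // serve recorded bytes first
            int n = Math.min(len, count - pos);
            System.arraycopy(buf, pos, b, off, n);
            pos += n;
            return n;
        }
        int n = in.read(b, off, len);
        if (n > 0 && marked) {
            record(b, off, n);
        }
        return n;
    }

    @Override
    public long skip(long n) throws IOException {
        // Skip by reading so that skipped bytes are recorded as well.
        byte[] scratch = new byte[(int) Math.min(8192L, Math.max(n, 0L))];
        long skipped = 0;
        while (skipped < n) {
            int r = read(scratch, 0, (int) Math.min(scratch.length, n - skipped));
            if (r < 0) {
                break;
            }
            skipped += r;
        }
        return skipped;
    }

    @Override
    public int available() throws IOException {
        return (count - pos) + in.available();
    }

    private void record(byte[] src, int off, int len) {
        if (count + len > buf.length) {
            buf = Arrays.copyOf(buf, Math.max(buf.length * 2, count + len));
        }
        System.arraycopy(src, off, buf, count, len);
        count += len;
        pos = count; // freshly read bytes count as already consumed
    }
}
```

With this, mark(Integer.MAX_VALUE) and mark(1) behave the same; the cost is
that memory use grows with the number of bytes read since the last mark().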

The main problem I see here is that the files to be read come from
different origins, which affects performance and memory consumption. If the original
InputStream is a FileInputStream, the whole buffering should be deferred
to the operating system (i.e. just reset the file cursor). If it's a
relatively slow HTTP connection, you don't have the option of a quick
reset. But sometimes you have the problem that you don't even know what
kind of InputStream you're dealing with (e.g. URL.openStream()).
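
For the FileInputStream case, "just reset the file cursor" could be sketched
roughly as follows, using the FileChannel that FileInputStream.getChannel()
returns. The class name FileRewindInputStream and the wrap() helper are
hypothetical names for illustration. The fallback for unknown stream types
is a plain BufferedInputStream, which is still bound by readLimit, so it only
shows where the type decision would be made, not a full solution.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.channels.FileChannel;

// Hypothetical name; a sketch only.
class FileRewindInputStream extends FilterInputStream {
    private final FileChannel channel;
    private long markedPosition = -1; // -1 means no mark has been set

    FileRewindInputStream(FileInputStream in) {
        super(in);
        // The channel shares its position with the stream it came from.
        this.channel = in.getChannel();
    }

    @Override
    public boolean markSupported() {
        return true;
    }

    @Override
    public synchronized void mark(int readLimit) {
        // readLimit is ignored: nothing is buffered, only the offset is kept.
        try {
            markedPosition = channel.position();
        } catch (IOException e) {
            markedPosition = -1; // mark() cannot throw; reset() will fail instead
        }
    }

    @Override
    public synchronized void reset() throws IOException {
        if (markedPosition < 0) {
            throw new IOException("mark() has not been called");
        }
        channel.position(markedPosition); // let the OS do the "buffering"
    }

    // Picks a strategy based on the stream type, when the type is known.
    static InputStream wrap(InputStream in) {
        if (in instanceof FileInputStream) {
            return new FileRewindInputStream((FileInputStream) in);
        }
        if (in.markSupported()) {
            return in;
        }
        // Unknown origin (for example URL.openStream()): this fallback
        // still honours readLimit and is therefore limited.
        return new BufferedInputStream(in);
    }
}
```

A caller would use FileRewindInputStream.wrap(stream) and then the normal
mark()/reset() calls, without needing to know which branch was taken.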

<snip what="Niall's good alternative idea worth looking into"/>

Jeremias Maerki
