tika-dev mailing list archives

From Jukka Zitting <jukka.zitt...@gmail.com>
Subject Re: Container Extractor?
Date Tue, 07 Sep 2010 08:59:09 GMT

On Mon, Sep 6, 2010 at 1:19 PM, Nick Burch <nick.burch@alfresco.com> wrote:
> Finally, pull vs push for the consumer.
> [...]
> I think the former would be a little bit more work for us, but is likely to
> lead to cleaner and simpler code for consumers. What do people think?

I'd start with a push mechanism as that supports streaming and is
better in line with the current design of Tika.

We can then add a pull layer on top of that, either by using a
background thread, as the ParsingReader class does, or by spooling
component data to temporary files or in-memory buffers when a
random-access backend is not available.
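
As a rough illustration of the background-thread option, here is a minimal
sketch of a pull iterator layered over a push-style producer. It does not use
any Tika API: the PullAdapter class, its constructor argument, and the queue
size of 16 are all hypothetical names and values chosen for the example. The
producer stands in for a push-based extractor that hands each embedded
component to a callback.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

/** Hypothetical pull adapter over a push-style producer; not part of Tika. */
class PullAdapter<T> implements Iterator<T> {
    // Marker object placed on the queue when the producer has finished.
    private static final Object END = new Object();

    // Bounded queue: the producer blocks when the consumer falls behind.
    private final BlockingQueue<Object> queue = new LinkedBlockingQueue<>(16);

    // Look-ahead slot; null means the next element has not been fetched yet.
    private Object next;

    /** The producer pushes each (non-null) item to the Consumer it is given. */
    PullAdapter(Consumer<Consumer<T>> producer) {
        Thread worker = new Thread(() -> {
            try {
                producer.accept(item -> put(item));
            } finally {
                // Always signal end of stream, even if the producer failed.
                // A real implementation would also forward the failure.
                put(END);
            }
        }, "pull-adapter");
        // Daemon thread so an abandoned iteration cannot keep the JVM alive.
        worker.setDaemon(true);
        worker.start();
    }

    private void put(Object item) {
        try {
            queue.put(item);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    @Override
    public boolean hasNext() {
        if (next == null) {
            try {
                next = queue.take(); // blocks until the producer pushes
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                next = END; // treat interruption as end of stream
            }
        }
        return next != END;
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        @SuppressWarnings("unchecked")
        T item = (T) next;
        next = null;
        return item;
    }
}
```

A consumer would then loop with hasNext() and next() while the producer runs
in the worker thread. The bounded queue keeps the streaming property of the
push side; the spooling alternative would trade that for temporary storage
and no extra thread.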


Jukka Zitting
