tika-dev mailing list archives

From "Jukka Zitting (JIRA)" <j...@apache.org>
Subject [jira] Resolved: (TIKA-388) Don't trust streams that claim mark support
Date Fri, 19 Mar 2010 13:52:27 GMT

     [ https://issues.apache.org/jira/browse/TIKA-388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting resolved TIKA-388.

       Resolution: Fixed
    Fix Version/s: 0.7
         Assignee: Jukka Zitting

As of revision 925217 the AutoDetectParser wraps every incoming stream in a BufferedInputStream,
regardless of whether the stream claims mark support. Resolving as fixed.
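
To illustrate why the unconditional wrapping helps, here is a minimal sketch. It is not the code from the Tika revision; ForgetfulStream, forgetful() and readToEndThenReread() are hypothetical names made up for this example. ForgetfulStream imitates the kind of stream described in the issue: it claims mark support but drops the mark once the end of stream is reached. BufferedInputStream keeps its own buffer and mark position and never calls reset() on the underlying stream, so the wrapped case should still be able to rewind.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class MarkSafeStreams {

    /** Hypothetical broken stream: claims mark support, loses the mark at end of stream. */
    static class ForgetfulStream extends FilterInputStream {
        private boolean markLost = false;

        ForgetfulStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            int b = super.read();
            if (b == -1) {
                markLost = true; // simulate the bug: EOF invalidates the mark
            }
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            int n = super.read(buf, off, len);
            if (n == -1) {
                markLost = true;
            }
            return n;
        }

        @Override
        public void mark(int readLimit) {
            markLost = false;
            super.mark(readLimit);
        }

        @Override
        public void reset() throws IOException {
            if (markLost) {
                throw new IOException("mark lost at end of stream");
            }
            super.reset();
        }
    }

    /** Builds a ForgetfulStream over the ASCII bytes of the given text. */
    static InputStream forgetful(String text) {
        return new ForgetfulStream(
                new ByteArrayInputStream(text.getBytes(StandardCharsets.US_ASCII)));
    }

    /** Marks the stream, reads it to the end, resets, and reads it again. */
    static String readToEndThenReread(InputStream in) throws IOException {
        in.mark(1024); // read limit chosen to exceed the small test data
        while (in.read() != -1) {
            // consume everything up to end of stream
        }
        in.reset(); // throws if the mark is no longer valid
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            out.write(b);
        }
        return new String(out.toByteArray(), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws IOException {
        try {
            readToEndThenReread(forgetful("hello"));
            System.out.println("raw: reset worked");
        } catch (IOException e) {
            System.out.println("raw: reset failed (" + e.getMessage() + ")");
        }
        // Always wrap, even though the raw stream reports markSupported() == true.
        InputStream wrapped = new BufferedInputStream(forgetful("hello"));
        System.out.println("wrapped: " + readToEndThenReread(wrapped));
    }
}
```

The cost is the one mentioned in the issue: a stream that is already buffered gets a second buffer layer.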

> Don't trust streams that claim mark support
> -------------------------------------------
>                 Key: TIKA-388
>                 URL: https://issues.apache.org/jira/browse/TIKA-388
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>            Reporter: Jukka Zitting
>            Assignee: Jukka Zitting
>            Priority: Minor
>             Fix For: 0.7
>
> As seen on tika-dev@ and in JCR-2576, there are some InputStream implementations that
> claim to support the mark feature, but lose the mark as soon as the end of stream has been
> reached. There's no way for a client to detect such behaviour, so it's probably best for Tika
> to always use BufferedInputStream to wrap incoming streams when mark support is needed. This
> may cause one layer of extra buffering, but avoids problems with such broken streams.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
