tika-dev mailing list archives

From "Gregory Chanan (JIRA)" <j...@apache.org>
Subject [jira] [Created] (TIKA-1096) CompressorParser: Add support for handling concatenated InputStreams
Date Sat, 23 Mar 2013 00:05:15 GMT
Gregory Chanan created TIKA-1096:
------------------------------------

             Summary: CompressorParser: Add support for handling concatenated InputStreams
                 Key: TIKA-1096
                 URL: https://issues.apache.org/jira/browse/TIKA-1096
             Project: Tika
          Issue Type: Improvement
          Components: parser
    Affects Versions: 1.4
            Reporter: Gregory Chanan
            Priority: Minor


COMPRESS-220 added support for CompressorStreamFactory to return an InputStream with decompressConcatenated
set to true.  Tika currently uses CompressorStreamFactory without this option, which caused
problems for me when parsing gzipped files that require it.
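
For readers unfamiliar with the term, a concatenated gzip file is several complete gzip members
written back to back. The sketch below is illustrative only and is not the attached patch: it uses
only the JDK's java.util.zip classes (not Commons Compress or Tika), and the class and method names
are hypothetical. It builds such a stream in memory and reads every member back, which is the
behaviour the decompressConcatenated option is meant to enable in Commons Compress.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class; the names are illustrative and do not come from Tika.
public class ConcatenatedGzipDemo {

    // Compress one string into a single, complete gzip member.
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return buffer.toByteArray();
    }

    // Append several gzip members back to back, as `cat a.gz b.gz` would.
    static byte[] concatenatedGzip(String... parts) throws IOException {
        ByteArrayOutputStream joined = new ByteArrayOutputStream();
        for (String part : parts) {
            joined.write(gzip(part));
        }
        return joined.toByteArray();
    }

    // Decompress the whole input. A reader that stops after the first
    // member would return only the first part of the text.
    static String decompressAll(byte[] data) throws IOException {
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                result.write(chunk, 0, n);
            }
        }
        return new String(result.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = concatenatedGzip("first member\n", "second member\n");
        System.out.print(decompressAll(data));
    }
}
```

With Commons Compress, the equivalent switch is the decompressConcatenated flag named above;
how Tika should expose it is left to the patch.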

As a workaround, I currently pre-process the InputStreams before sending them to Tika; it
would be great if Tika could handle this for me.

I wrote up a quick patch that adds this option; I'll attach it soon.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
