tika-dev mailing list archives

From samir pendharkar <sampendhar...@gmail.com>
Subject Re: [ANNOUNCE] Apache Tika 1.4 Released
Date Mon, 08 Jul 2013 10:54:10 GMT
Hi All,

I updated to Tika 1.4 and everything else seems to be working fine.
However, RTF parsing in particular is failing with an
ArrayIndexOutOfBoundsException for at least 4-5 files. These files parsed
correctly with the Tika 1.3 release, so this looks like a regression.
Can somebody shed some light on this (and on how to fix it)?

This is the exception I am getting with Tika 1.4:
Caused by: java.lang.ArrayIndexOutOfBoundsException: 9
    at org.apache.tika.parser.rtf.TextExtractor.processControlWord(TextExtractor.java:872)
    at org.apache.tika.parser.rtf.TextExtractor.parseControlWord(TextExtractor.java:566)
    at org.apache.tika.parser.rtf.TextExtractor.parseControlToken(TextExtractor.java:492)
    at org.apache.tika.parser.rtf.TextExtractor.extract(TextExtractor.java:459)
    at org.apache.tika.parser.rtf.TextExtractor.extract(TextExtractor.java:448)
    at org.apache.tika.parser.rtf.RTFParser.parse(RTFParser.java:56)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
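
To narrow down which files trigger the exception, a small harness along these lines might help. This is only a sketch: `RtfFailureScan` and `FileParser` are hypothetical names, and `FileParser` is a stand-in for whatever call actually invokes the parser (it is not part of the Tika API). The harness runs the parser on each file and records the root cause for every file that throws.

```java
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RtfFailureScan {

    /** Hypothetical stand-in for the real parse call; not a Tika type. */
    @FunctionalInterface
    public interface FileParser {
        void parse(Path file) throws Exception;
    }

    /** Runs the parser on each file and records the root cause for each failure. */
    public static Map<Path, Throwable> scan(List<Path> files, FileParser parser) {
        Map<Path, Throwable> failures = new LinkedHashMap<>();
        for (Path file : files) {
            try {
                parser.parse(file);
            } catch (Exception e) {
                // Unwrap "Caused by:" chains so the underlying exception is reported.
                failures.put(file, rootCause(e));
            }
        }
        return failures;
    }

    /** Follows getCause() until the innermost exception is reached. */
    static Throwable rootCause(Throwable t) {
        while (t.getCause() != null && t.getCause() != t) {
            t = t.getCause();
        }
        return t;
    }
}
```

Passing the list of suspect RTF files and a lambda that wraps the actual parse call would yield a map from each failing file to its innermost exception, which should make it easier to attach a minimal failing sample to a bug report.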


On Tue, Jul 2, 2013 at 11:31 AM, Chris Mattmann <mattmann@apache.org> wrote:

> The Apache Tika project is pleased to announce the release of Apache Tika
> 1.4. The release contents have been pushed out to the main Apache release
> site and to the Maven Central sync, so the releases should be available as
> soon as the mirrors get the syncs.
>
> Apache Tika is a toolkit for detecting and extracting metadata and
> structured text content from various documents using existing parser
> libraries.
>
> Apache Tika 1.4 contains a number of improvements and bug fixes. Details
> can be found in the changes file:
>
> http://www.apache.org/dist/tika/CHANGES-1.4.txt
>
> Apache Tika is available in source form from the following download page:
>
> http://www.apache.org/dyn/closer.cgi/tika/apache-tika-1.4-src.zip
>
> Apache Tika is also available in binary form, or for use with Maven 2, from
> the Central Repository:
>
> http://repo1.maven.org/maven2/org/apache/tika/
>
> In the initial 48 hours, the release may not be available on all mirrors.
> When downloading from a mirror site, please remember to verify the
> downloads using signatures found on the Apache site:
>
> https://people.apache.org/keys/group/tika.asc
>
>
> For more information on Apache Tika, visit the project home page:
>
> http://tika.apache.org/
>
> -- Chris Mattmann, on behalf of the Apache Tika community
>
>
>
