uima-dev mailing list archives

From Jukka Zitting <jukka.zitt...@gmail.com>
Subject Re: Can we update the version of Tika in the TikaAnnotator to 0.4
Date Wed, 26 Aug 2009 16:51:12 GMT
Hi,

On Tue, Aug 25, 2009 at 3:24 AM, Marshall Schor<msa@schor.com> wrote:
> However, I notice that there are no test cases for this annotator, and
> also that there is another tika artifact at the 0.4 level, called
> tika-parsers.  Is this other artifact needed?  If so, how does it need
> to be incorporated?

The tika-core jar contains only the core client-visible classes and
interfaces and has zero dependencies beyond Java 5. All the actual
parser implementations and external parser dependencies are in the
tika-parsers jar. This split is new in Tika 0.4 and was done to better
support users who only need the core functionality.

For UIMA, I suppose you'll want support for all the document types, so
the correct dependency settings would be:

  <dependency>
    <groupId>org.apache.tika</groupId>
    <artifactId>tika-core</artifactId>
    <version>0.4</version>
  </dependency>
  <dependency>
    <groupId>org.apache.tika</groupId>
    <artifactId>tika-parsers</artifactId>
    <version>0.4</version>
  </dependency>

See the Maven section in
http://lucene.apache.org/tika/gettingstarted.html for the full
details.
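
Since the two artifacts are released together, you may want to keep
their versions in step. A minimal sketch, assuming you are free to add
a property to the annotator's POM (the property name tika.version is
only my illustration; neither Tika nor Maven requires it):

```xml
<!-- Hypothetical property name; any name works as long as both
     dependencies below refer to the same one. -->
<properties>
  <tika.version>0.4</tika.version>
</properties>

<dependencies>
  <!-- Core client-visible classes and interfaces -->
  <dependency>
    <groupId>org.apache.tika</groupId>
    <artifactId>tika-core</artifactId>
    <version>${tika.version}</version>
  </dependency>
  <!-- Parser implementations and their external dependencies -->
  <dependency>
    <groupId>org.apache.tika</groupId>
    <artifactId>tika-parsers</artifactId>
    <version>${tika.version}</version>
  </dependency>
</dependencies>
```

With this layout a later Tika upgrade is a one-line change to the
property, and the two jars cannot drift apart by accident.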

BR,

Jukka Zitting
