tika-dev mailing list archives

From "Mattmann, Chris A (3980)" <chris.a.mattm...@jpl.nasa.gov>
Subject Re: Confusion
Date Mon, 01 Dec 2014 02:29:15 GMT
Hi Peter,

[moving webmaster@apache.org to BCC]

Thanks for your question. You’ll have to subscribe to the Tika list or
check a mailing list archive for Tika to see the reply to this. My suggestion
is to subscribe to the Tika list by sending a blank email to
and following the instructions from there. Some replies below:

-----Original Message-----
From: Peter Hodges <phodges@id.iit.edu>
Date: Sunday, November 30, 2014 at 7:56 AM
To: "webmaster@apache.org" <webmaster@apache.org>
Subject: Confusion

>I'm sure this is not the appropriate email but one must start someplace.
>I'd like to try Tika for manipulating text.
>However, despite the labels "getting started" etc in your online
>I find the directions confusing and hard to understand.
>As a designer and a non inner circle software/programmer expert
>I'd like to see a simple example:
>1) Evidently Tika requires Maven.

If you’d like to build Tika, yes. If you’d like to simply use
Tika in an application, try out the tika-app jar on the downloads
page. The tika-app.jar can be invoked with a Java runtime by typing
java -jar tika-app-X.Y.jar --help

(where X.Y is the version number, e.g., 1.6).
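
As a minimal sketch of the step above, assuming the tika-app jar has already been downloaded into the current directory, the following shell snippet extracts plain text from a document. The helper name tika_text, the TIKA_JAR variable, and the file name sample.pdf are illustrative assumptions, not part of Tika; --text is the tika-app option for plain-text output.

```shell
#!/bin/sh
# Hypothetical helper: extract plain text from a file with tika-app.
# TIKA_JAR is an assumed variable; point it at the jar you downloaded.
TIKA_JAR="${TIKA_JAR:-tika-app-1.6.jar}"

tika_text() {
    # $1 is the document to parse
    if [ ! -f "$TIKA_JAR" ]; then
        echo "tika-app jar not found: $TIKA_JAR" >&2
        return 1
    fi
    # --text asks tika-app to print the extracted plain text to stdout
    java -jar "$TIKA_JAR" --text "$1"
}

# Example (file name is illustrative):
# tika_text sample.pdf > sample.txt
```

No Maven and no Java source code are needed for this; only a Java runtime and the downloaded jar.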

>Do these codes then go in the same directory (e.g., usr on Linux)?

If you want to build Tika, a good recipe, e.g., on Linux, is:
[with Maven 3.x installed]
[with Java 1.6.x or higher installed]

1. mkdir $HOME/src
2. cd $HOME/src
3. svn co http://svn.apache.org/repos/asf/tika/trunk tika
4. cd tika
5. export MAVEN_OPTS="-Xms128m -Xmx256m"
6. mvn install

(wait a while)

7. Inside tika-app/target you will find the tika-app JAR file.
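
After the build finishes, the jar can be located with a small shell sketch like the one below. It assumes the checkout lives in $HOME/src/tika as in the recipe; the helper name find_tika_app_jar is hypothetical, and skipping sources, javadoc, and tests jars is a precaution in case the build produces them alongside the main jar.

```shell
#!/bin/sh
# Hypothetical helper: print the path of the tika-app jar produced by `mvn install`.
# Assumes the checkout location from the recipe unless a directory is passed as $1.
find_tika_app_jar() {
    target_dir="${1:-$HOME/src/tika/tika-app/target}"
    for jar in "$target_dir"/tika-app-*.jar; do
        # Skip the literal pattern when the glob matches nothing
        [ -f "$jar" ] || continue
        # Skip auxiliary jars, if any were built
        case "$jar" in
            *-sources.jar|*-javadoc.jar|*-tests.jar) continue ;;
        esac
        echo "$jar"
        return 0
    done
    return 1
}

# Example:
# java -jar "$(find_tika_app_jar)" --help
```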

>2) After extraction how does one execute a simple example?
>I followed the Tika directory structure down through four or five levels
>to find the parsing example. This appears to be java code.
>I return to the online getting started but find line after line of code
>(is this java? Python? or ?)



(especially at the bottom)

You can also use Tika as a REST server, e.g., here:


>The literature contains many papers about HCI, user research,
>participatory design, and other topics related to human centered design.
>These are powerful open source tools. It would be helpful to engaging a
>wider community to have some simple, clear directions about how to enter
>into using them.

I’m not sure I follow your comment here: how does the literature relate
to Tika, and which literature do you mean?

Hope that helps with some of the answers.


Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattmann@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
