tika-dev mailing list archives

From "Jukka Zitting" <jukka.zitt...@gmail.com>
Subject Third Tika report
Date Tue, 12 Jun 2007 16:48:06 GMT

This is my draft for the third Tika report. This report completes the
initial three-month period, after which we will be reporting only once
per quarter.

I'm hoping that we could have our first release out by the next report
in September, but I guess it's safer not to set any expectations in
the official report at this point.

Tika is a toolkit for detecting and extracting metadata and structured
text content from various documents using existing parser libraries.
Tika entered incubation on March 22nd, 2007.


The Tika mailing lists have been relatively quiet lately, probably
because with little code we don't yet have many concrete issues to
talk about.


We saw the first piece of Tika code when Chris A. Mattmann ported the
Nutch metadata framework to Tika. Rida Benjelloun is currently working
on bringing the Lius code into Tika, but the initial commits on that
front have not yet happened.

Issues before graduation

The Tika project is still at an early stage of incubation. We need to
continue bringing in the initial codebases and probably target an
initial incubating release later this year. We also need to work on
growing the community and figuring out how to best interact with
external parser projects.


Jukka Zitting
