tika-dev mailing list archives

From Chris Mattmann <chris.mattm...@jpl.nasa.gov>
Subject Re: Second Tika report
Date Wed, 09 May 2007 17:49:23 GMT
+1, thanks for putting this together, Jukka.

I plan on moving over the parse plugins stuff and the metadata container
sometime this month into the Tika codebase, where it can be maintained.


On 5/9/07 9:05 AM, "Doug Cutting" <cutting@apache.org> wrote:

> +1 This jibes with the activity I've seen.  Thanks for writing this!
>
> Doug
>
> Jukka Zitting wrote:
>> Hi,
>>
>> I've prepared the following as the Tika report for this month.
>>
>> <report>
>> Tika is a toolkit for detecting and extracting metadata and structured
>> text content from various documents using existing parser libraries.
>> Tika entered incubation on March 22nd, 2007.
>>
>> Community
>>
>> We had a good project bootstrap meeting as a part of the text analysis
>> BOF at the ApacheCon EU in Amsterdam. The resulting ideas were
>> summarized on the project mailing list, and the first design threads
>> have started.
>>
>> Development
>>
>> We've started discussing the design of the Tika toolkit. It seems like
>> we will select one of the existing codebases listed in the project
>> proposal as the basis of an early 0.1 release, and start refactoring
>> the code into a more generic toolkit. The Tika svn tree is still
>> empty, but I expect us to see the first code commits before the next
>> report.
>>
>> Infrastructure
>>
>> All the initial infrastructure is now in place. There is still some
>> activity on the temporary Tika wiki on the Google Project hosting
>> service, so we may end up requesting a Tika wiki to be set up on the
>> ASF infrastructure.
>>
>> Issues before graduation
>>
>> The Tika project is still at an early stage of incubation. The most
>> important tasks before graduation are to develop and release the Tika
>> codebase and to grow a diverse and sustainable project community.
>> </report>
>>
>> BR,
>>
>> Jukka Zitting

Chris A. Mattmann
Key Staff Member
Modeling and Data Management Systems Section (387)
Data Management Systems and Technologies Group

Jet Propulsion Laboratory            Pasadena, CA
Office: 171-266B                        Mailstop:  171-246

Disclaimer:  The opinions presented within are my own and do not reflect
those of either NASA, JPL, or the California Institute of Technology.
