uima-dev mailing list archives

From Rico Landefeld <Rico.Landef...@uni-jena.de>
Subject Lucas - Lucene CAS Indexer release
Date Wed, 21 Jan 2009 16:57:05 GMT
Dear UIMA Developers and Users,

The JULIE Lab is happy to announce the release of Lucas 0.5 - a UIMA CAS
consumer component which writes CAS data into a Lucene index.

At its heart is a flexible XML-based "mapping configuration file" in
which the user determines which UIMA annotations are put into which
Lucene field, and how each field is set up (e.g. indexed and/or
stored). In addition, some basic functionality for (ontological)
hypernym indexing is provided.

Additionally, Lucas is able to perform offset-based token stream
alignment and merging of UIMA annotations (via token position increment)
in the same Lucene field (e.g. "documenttext" or "title").
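
To illustrate the idea of merging by token position increment, here is a
minimal sketch in plain Python. It is not Lucas code and does not use the
Lucene or UIMA APIs; the names Annotation, Token and merge_streams are
hypothetical, and the alignment rule (annotations that begin at the same
character offset share one position) is a simplifying assumption. In
Lucene terms, a position increment of 0 stacks a token on the position of
the previous one, while 1 advances to the next position.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Hypothetical stand-in for a UIMA annotation: offsets plus a term."""
    begin: int
    end: int
    text: str


@dataclass(frozen=True)
class Token:
    """Hypothetical stand-in for an index token with a position increment."""
    term: str
    begin: int
    end: int
    position_increment: int


def merge_streams(*streams):
    """Merge annotation streams into one token stream for a single field.

    Assumption: annotations starting at the same begin offset are stacked
    on the same position (increment 0); any other token advances by 1.
    """
    # sorted() is stable, so for equal begin offsets the order of the
    # input streams is preserved.
    ordered = sorted((a for s in streams for a in s), key=lambda a: a.begin)
    merged = []
    last_begin = None
    for a in ordered:
        increment = 0 if a.begin == last_begin else 1
        merged.append(Token(a.text, a.begin, a.end, increment))
        last_begin = a.begin
    return merged


# Example document text: "aspirin reduces fever"
tokens = [
    Annotation(0, 7, "aspirin"),
    Annotation(8, 15, "reduces"),
    Annotation(16, 21, "fever"),
]
entities = [
    Annotation(0, 7, "DRUG"),
    Annotation(16, 21, "SYMPTOM"),
]

for tok in merge_streams(tokens, entities):
    print(tok.term, tok.begin, tok.end, tok.position_increment)
```

With this stacking, a phrase query over the field could match either the
surface word or the annotation label at the same position, which is the
usual motivation for merging streams this way.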

Lucas, along with the documentation, sources and a sample mapping
file, is available at:

Since this project tries to bridge two Apache projects (UIMA and
Lucene), we would like to submit it to the UIMA Sandbox in order to
solicit further development by the UIMA community. What steps do we
have to take in order to start this process? As far as we know, a
sandbox candidate has to undergo a voting process on the uima-dev list.

Please test the component and report any bugs or suggestions for
improvement back to us.

Best regards,
Rico Landefeld
Joachim Wermter
