uima-dev mailing list archives

From Jörn Kottmann <kottm...@gmail.com>
Subject Re: Building the eclipse update site
Date Thu, 23 Jul 2009 10:22:01 GMT
Jörn Kottmann wrote:
>
>> A collection of text documents that you can run
>> analysis on.  If I understand correctly, the Cas
>> Editor currently requires XCAS/XmiCAS files.  It
>> would be nice if users could just add their text
>> files and then either create annotations manually
>> with the Cas Editor, or automatically by running
>> some analysis and then view the results using the
>> Cas Editor.  Then we could add results comparison
>> etc.  See
>> http://dl.alphaworks.ibm.com/technologies/tap/text_analysis_perspective.pdf 
>>
>> for an (outdated) description of what we have
>> in-house.  It's geared more towards a business user
>> than a developer, but the ideas of document collections
>> and the development cycle are equally applicable.
>> If there was enough interest here, I think that
>> would be a good direction to go in.
>>   
> Yes, to me that sounds like the right way.
> We could also use it for debugging an AE: a user
> would define a debug configuration and add the
> collection as the document source.
How would you define the format of a document collection?

To open a CAS document, both the document itself and a type system
for the document are needed.

In the Cas Editor right now, an Input Collection is a Corpus folder that
contains xmi/xcas files in a single directory. Together with the project
type system, these files can be loaded by UIMA. It has been criticized,
though, for not allowing sub directories for structuring its documents.
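
To make the sub directory point concrete, here is a rough sketch of how a
collection could be gathered recursively instead of from one flat folder.
It uses only the JDK and is not how the Cas Editor does it today; the class
name CorpusScanner and both method names are made up for illustration, and
I am assuming the files are recognized by their .xmi/.xcas extension.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of the Cas Editor or UIMA.
public class CorpusScanner {

    // Assumption: serialized CAS files are identified by file extension.
    static boolean isCasFile(Path file) {
        String name = file.getFileName().toString().toLowerCase(Locale.ROOT);
        return name.endsWith(".xmi") || name.endsWith(".xcas");
    }

    // Walks the corpus folder including all sub directories and
    // returns the CAS files found, in sorted order.
    static List<Path> findCasFiles(Path corpusRoot) throws IOException {
        try (Stream<Path> paths = Files.walk(corpusRoot)) {
            return paths.filter(Files::isRegularFile)
                        .filter(CorpusScanner::isCasFile)
                        .sorted()
                        .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        // Print each document relative to the corpus root.
        for (Path casFile : findCasFiles(root)) {
            System.out.println(root.relativize(casFile));
        }
    }
}
```

Each file found this way would still need the project type system before
it can be loaded; the sketch only covers locating the documents.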

Jörn
