uima-dev mailing list archives

From "Burn Lewis" <burnle...@gmail.com>
Subject Re: UIMA chunking
Date Wed, 23 Jul 2008 12:23:14 GMT

No, you cannot use a CAS Multiplier (CM) directly in a CPE, but you can wrap
it in an aggregate and use that aggregate in the CPE.  The CPE could consist
of CollectionReader + Aggregate + CasConsumer, where the Aggregate contains
the splitting CM + Annotators + merging CM.  (For use in a CPE the aggregate
must declare outputsNewCASes = false.)  Alternatively, you could put all of
these components into one aggregate and run it as a single AE, but you would
lose the error handling that the CPE provides.  The new UIMA-AS, which is
almost released, provides error handling as well as scaleout that could
process your document segments in parallel and so improve throughput.

If the size of the merged CAS is a concern, you may be able to do some of the
consuming before the merge, since there is nothing special about
CasConsumers: they can run inside the aggregate like any other component.  If
no downstream analytics need the full document, you could omit the second CM
and let the aggregate end with a CasConsumer.  The segment CASes are then
discarded once they have been consumed, and only the input CAS is returned to
the CPE, which would then not need a CC of its own.
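
To make the two layouts above concrete, here is a rough plain-Java sketch of
the data flow only.  It does not use the UIMA API: strings stand in for CASes,
and every name in it (ChunkingSketch, split, annotate, merge,
processWithMerge, processWithConsumer) is made up for illustration.  In both
variants exactly one result leaves the "aggregate" for each input, which is
the behaviour that outputsNewCASes = false describes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustration only: plain strings stand in for CASes; no UIMA classes are used.
class ChunkingSketch {

    // Stand-in for the splitting CM: one document in, several segments out.
    public static List<String> split(String document, int segmentSize) {
        List<String> segments = new ArrayList<>();
        for (int i = 0; i < document.length(); i += segmentSize) {
            int end = Math.min(document.length(), i + segmentSize);
            segments.add(document.substring(i, end));
        }
        return segments;
    }

    // Stand-in for the annotators: upper-casing plays the role of "analysis".
    public static String annotate(String segment) {
        return segment.toUpperCase(Locale.ROOT);
    }

    // Stand-in for the merging CM: several segments in, one merged result out.
    public static String merge(List<String> annotatedSegments) {
        return String.join("", annotatedSegments);
    }

    // Layout 1: splitting CM + annotators + merging CM.
    // The segments never leave the aggregate; only the merged result does.
    public static String processWithMerge(String document, int segmentSize) {
        List<String> annotated = new ArrayList<>();
        for (String segment : split(document, segmentSize)) {
            annotated.add(annotate(segment));
        }
        return merge(annotated);
    }

    // Layout 2: no merging CM. A consumer at the end of the aggregate stores
    // each annotated segment (here: appends it to a list), the segment is then
    // dropped, and the unchanged input is what goes back to the caller.
    public static String processWithConsumer(String document, int segmentSize,
                                             List<String> consumerSink) {
        for (String segment : split(document, segmentSize)) {
            consumerSink.add(annotate(segment));
        }
        return document;
    }

    public static void main(String[] args) {
        System.out.println(processWithMerge("abcdefghij", 4));

        List<String> sink = new ArrayList<>();
        System.out.println(processWithConsumer("abcdefghij", 4, sink));
        System.out.println(sink);
    }
}
```

In the second variant no large merged result is ever built, which is the point
of consuming before the merge when the merged CAS would be too big.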

