uima-dev mailing list archives

From Richard Eckart de Castilho <eck...@ukp.informatik.tu-darmstadt.de>
Subject Re: [jira] [Created] (UIMA-2419) Initial view for sofa unaware components not automatically created
Date Fri, 15 Jun 2012 21:36:11 GMT
On 11.06.2012 at 20:11, Eddie Epstein wrote:

> Can you be a bit more explicit what the failing scenario is?

Take a scenario where you want to access the CASes produced by an aggregate pipeline
directly: there is no CAS consumer, but you want to use a reader to fill the CASes (this is
what the demo below implements).

Now add the need for sofa mapping to that scenario, because you want to run a complex analysis.
The collection reader is not sofa aware, but you do want it to write to some view "A" instead
of writing to the "_InitialView", because "A" is what the next component will process. With
the reader wrapped this way that becomes possible, because in the AnalysisEngineDescription
I can declare sofa mappings for the reader. However, I then get an exception due to UIMA-2419.
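
To make the failure mode concrete, here is a toy model in plain Java. It is not UIMA code:
ToyCas, resolveView and the createIfMissing flag are hypothetical names invented for this
sketch. It assumes the framework resolves a sofa-unaware component's default view name
through the aggregate's mapping and then looks that view up without creating it.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model only: none of these types are UIMA classes. */
public class SofaMappingSketch {

    static final String INITIAL_VIEW = "_InitialView";

    /** Stand-in for a CAS: a bag of named views, each holding document text. */
    static class ToyCas {
        final Map<String, String> views = new HashMap<>();

        ToyCas() {
            views.put(INITIAL_VIEW, ""); // the only view that exists up front
        }
    }

    /**
     * Returns the name of the view a sofa-unaware component would be handed.
     * mappings: component key -> aggregate-level view name (hypothetical shape).
     */
    static String resolveView(ToyCas cas, Map<String, String> mappings,
                              String componentKey, boolean createIfMissing) {
        String target = mappings.getOrDefault(componentKey, INITIAL_VIEW);
        if (!cas.views.containsKey(target)) {
            if (!createIfMissing) {
                // Models the reported problem: the mapped view was never created.
                throw new IllegalStateException("No view named " + target);
            }
            cas.views.put(target, ""); // models the behaviour asked for in the issue
        }
        return target;
    }

    public static void main(String[] args) {
        Map<String, String> mappings = new HashMap<>();
        mappings.put("reader", "A"); // redirect the reader's default view to "A"

        ToyCas cas = new ToyCas();
        try {
            resolveView(cas, mappings, "reader", false);
        } catch (IllegalStateException e) {
            System.out.println("strict lookup failed: " + e.getMessage());
        }

        String view = resolveView(cas, mappings, "reader", true);
        cas.views.put(view, "some text");
        System.out.println("reader wrote to view " + view);
    }
}
```

The strict lookup corresponds to what I am seeing; creating the view on demand is what I
would expect for components that cannot create it themselves.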

> I'm definitely confused by wrapping a CR in an AE descriptor. Is it
> possible to paste here an aggregate descriptor using sample components
> from the UIMA SDK that demonstrates the problem?

So here is the demo of wrapping a CR in an AE. There are no sofa mappings here because they
would cause an exception. The SimpleReader creates a single CAS and sets the text; the
SimpleAnalyzer additionally sets the document language. It's a very basic example.
The full runnable sources are at


/**
 * Demo of disguising a reader as a CAS multiplier. This works because internally, UIMA wraps
 * the reader in a CollectionReaderAdapter. The nice thing about this is that, in principle,
 * it would be possible to define sofa mappings. However, UIMA-2419 prevents this.
 */
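public void demoAggregateWithDisguisedReader() throws UIMAException {
  ResourceSpecifierFactory factory = UIMAFramework.getResourceSpecifierFactory();

  // The reader, described as an analysis engine that outputs new CASes.
  // Assumed setup: the configuration lines are not shown in this message.
  AnalysisEngineDescription reader = factory.createAnalysisEngineDescription();
  reader.setPrimitive(true);
  reader.setAnnotatorImplementationName(SimpleReader.class.getName());
  reader.getAnalysisEngineMetaData().getOperationalProperties().setOutputsNewCASes(true);

  // The analyzer, an ordinary primitive analysis engine (assumed setup as well).
  AnalysisEngineDescription analyzer = factory.createAnalysisEngineDescription();
  analyzer.setPrimitive(true);
  analyzer.setAnnotatorImplementationName(SimpleAnalyzer.class.getName());

  // Run the reader first, then the analyzer.
  FixedFlow flow = factory.createFixedFlow();
  flow.setFixedFlow(new String[] { "reader", "analyzer" });

  AnalysisEngineDescription aggregate = factory.createAnalysisEngineDescription();
  aggregate.setPrimitive(false);
  aggregate.getAnalysisEngineMetaData().setFlowConstraints(flow); // assumed: attach the flow
  aggregate.getAnalysisEngineMetaData().getOperationalProperties().setOutputsNewCASes(true); // assumed
  aggregate.getDelegateAnalysisEngineSpecifiersWithImports().put("reader", reader);
  aggregate.getDelegateAnalysisEngineSpecifiersWithImports().put("analyzer", analyzer);

  // Pull the CASes straight out of the pipeline; no CAS consumer is involved.
  AnalysisEngine pipeline = UIMAFramework.produceAnalysisEngine(aggregate);
  CasIterator iterator = pipeline.processAndOutputNewCASes(pipeline.newCAS());
  while (iterator.hasNext()) {
    CAS cas = iterator.next();
    System.out.printf("[%s] is [%s]%n", cas.getDocumentText(), cas.getDocumentLanguage());
  }
}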

-- Richard

Richard Eckart de Castilho
Technical Lead
Ubiquitous Knowledge Processing Lab (UKP-TUD) 
FB 20 Computer Science Department      
Technische Universität Darmstadt
Hochschulstr. 10, D-64289 Darmstadt, Germany 
phone [+49] (0)6151 16-7477, fax -5455, room S2/02/B117
Web Research at TU Darmstadt (WeRC) www.werc.tu-darmstadt.de
