uima-dev mailing list archives

From Eddie Epstein <eaepst...@gmail.com>
Subject Re: [jira] [Created] (UIMA-2419) Initial view for sofa unaware components not automatically created
Date Sun, 17 Jun 2012 16:11:00 GMT
Richard,

Non-default views are currently created by application code, not by
the framework. The absence of an expected view is more clearly
diagnostic than the highly varied errors that would come if the
framework automatically created a view.

Sofa mapping is intended to solve your scenario by having the CR fill
the default _InitialView and then mapping view A to the _InitialView
for the analyzer. When the analyzer asks for view(A) it would get
_InitialView.
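
To make the name resolution concrete, here is a toy model in plain Java.
It is only a sketch and does not use the UIMA API: the class
SofaMappingSketch and its methods are hypothetical, and the parameter
names (componentKey, componentSofaName, aggregateSofaName) are chosen to
mirror the elements of a sofa mapping in an aggregate descriptor. It
assumes that a view name with no mapping resolves to itself.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of sofa-name resolution in an aggregate. NOT the UIMA API. */
public class SofaMappingSketch {
    static final String INITIAL_VIEW = "_InitialView";

    // Key: componentKey + "/" + componentSofaName, value: aggregateSofaName.
    private final Map<String, String> mappings = new HashMap<>();

    /** Declare that a component's view name refers to an aggregate-level view. */
    public void addMapping(String componentKey, String componentSofaName,
            String aggregateSofaName) {
        mappings.put(componentKey + "/" + componentSofaName, aggregateSofaName);
    }

    /** Resolve the view a component asks for; unmapped names resolve to themselves. */
    public String resolve(String componentKey, String requestedView) {
        return mappings.getOrDefault(componentKey + "/" + requestedView, requestedView);
    }

    public static void main(String[] args) {
        SofaMappingSketch aggregate = new SofaMappingSketch();
        // Map view A of the analyzer to the default view.
        aggregate.addMapping("analyzer", "A", INITIAL_VIEW);

        // Stand-in for the views of a CAS: view name -> document text.
        Map<String, String> views = new HashMap<>();

        // The reader has no mapping, so it fills the default view.
        views.put(aggregate.resolve("reader", INITIAL_VIEW), "text written by the reader");

        // The analyzer asks for view A and gets the default view.
        String seen = views.get(aggregate.resolve("analyzer", "A"));
        System.out.println("analyzer sees: " + seen);
        // prints "analyzer sees: text written by the reader"
    }
}
```

The point of the sketch is that no view named A ever has to be created:
only the analyzer's request for A is redirected.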

Did you try this?

Eddie


On Fri, Jun 15, 2012 at 5:36 PM, Richard Eckart de Castilho
<eckart@ukp.informatik.tu-darmstadt.de> wrote:
> On 11.06.2012 at 20:11, Eddie Epstein wrote:
>
>> Can you be a bit more explicit what the failing scenario is?
>
> Take a scenario where you want to access the CASes produced by an
> aggregate pipeline directly, with no CAS consumer, but you want to use
> a reader to fill the CASes (this is what is implemented in the demo
> below).
>
> Now add the need for sofa mapping to that scenario, because you want
> to run a complex analysis. The collection reader is not sofa aware, but
> you do want it to write to some view "A" instead of writing to the
> "_InitialView", because "A" is what the next component will process.
> This is possible now, because in the AnalysisEngineDescription I can
> declare sofa mappings for the reader. However, I would get an exception
> due to UIMA-2419.
>
>> I'm definitely confused by wrapping a CR in an AE descriptor. Is it
>> possible to paste here an aggregate descriptor using sample components
>> from the UIMA SDK that demonstrates the problem?
>
> So here is the demo of wrapping a CR in an AE. There are no sofa
> mappings here because they would cause an exception. The SimpleReader
> creates a single CAS and sets the text; the SimpleAnalyzer additionally
> sets the document language. It's a very basic example.
> The full runnable sources are at
>
> http://code.google.com/p/uimafit/source/browse/trunk/uimaFIT/src/test/java/org/uimafit/factory/AggregateWithReaderTest.java
>
> /**
>  * Demo of disguising a reader as a CAS multiplier. This works because internally, UIMA
>  * wraps the reader in a CollectionReaderAdapter. The nice thing about this is that, in
>  * principle, it would be possible to define sofa mappings. However, UIMA-2419 prevents this.
>  */
> @Test
> public void demoAggregateWithDisguisedReader() throws UIMAException {
>  ResourceSpecifierFactory factory = UIMAFramework.getResourceSpecifierFactory();
>
>  AnalysisEngineDescription reader = factory.createAnalysisEngineDescription();
>  reader.getMetaData().setName("reader");
>  reader.setPrimitive(true);
>  reader.setImplementationName(SimpleReader.class.getName());
>  reader.getAnalysisEngineMetaData().getOperationalProperties().setOutputsNewCASes(true);
>
>  AnalysisEngineDescription analyzer = factory.createAnalysisEngineDescription();
>  analyzer.getMetaData().setName("analyzer");
>  analyzer.setPrimitive(true);
>  analyzer.setImplementationName(SimpleAnalyzer.class.getName());
>
>  FixedFlow flow = factory.createFixedFlow();
>  flow.setFixedFlow(new String[] { "reader", "analyzer" });
>
>  AnalysisEngineDescription aggregate = factory.createAnalysisEngineDescription();
>  aggregate.getMetaData().setName("aggregate");
>  aggregate.setPrimitive(false);
>  aggregate.getAnalysisEngineMetaData().setFlowConstraints(flow);
>  aggregate.getAnalysisEngineMetaData().getOperationalProperties().setOutputsNewCASes(true);
>  aggregate.getAnalysisEngineMetaData().getOperationalProperties()
>      .setMultipleDeploymentAllowed(false);
>  aggregate.getDelegateAnalysisEngineSpecifiersWithImports().put("reader", reader);
>  aggregate.getDelegateAnalysisEngineSpecifiersWithImports().put("analyzer", analyzer);
>
>  AnalysisEngine pipeline = UIMAFramework.produceAnalysisEngine(aggregate);
>  CasIterator iterator = pipeline.processAndOutputNewCASes(pipeline.newCAS());
>  while (iterator.hasNext()) {
>    CAS cas = iterator.next();
>    System.out.printf("[%s] is [%s]%n", cas.getDocumentText(), cas.getDocumentLanguage());
>  }
> }
>
> -- Richard
>
> --
> -------------------------------------------------------------------
> Richard Eckart de Castilho
> Technical Lead
> Ubiquitous Knowledge Processing Lab (UKP-TUD)
> FB 20 Computer Science Department
> Technische Universität Darmstadt
> Hochschulstr. 10, D-64289 Darmstadt, Germany
> phone [+49] (0)6151 16-7477, fax -5455, room S2/02/B117
> eckart@ukp.informatik.tu-darmstadt.de
> www.ukp.tu-darmstadt.de
> Web Research at TU Darmstadt (WeRC) www.werc.tu-darmstadt.de
> -------------------------------------------------------------------
