ctakes-dev mailing list archives

From "Finan, Sean" <Sean.Fi...@childrens.harvard.edu>
Subject Re: Accessing the External Resource from the UimaContext without Using XML descriptor [EXTERNAL]
Date Tue, 25 Jun 2019 16:25:46 GMT
Ah.

You are trying to use an old annotator.  It was never updated to be a uimafit component and
I think that it may not work with the PipelineBuilder.
Newer annotators have (for the most part) simpler interfaces and do not require explicit specification
of resources, resource types, etc.

You have several options (worst to best):
1.  Don't use PipelineBuilder
2.  Wrap the older annotator in a uimafit-compatible component
3.  Make a method that generates a description:   UmlsDictionaryLookupAnnotator does this
in a method named createAnnotatorDescription()
https://svn.apache.org/repos/asf/ctakes/trunk/ctakes-dictionary-lookup/src/main/java/org/apache/ctakes/dictionary/lookup/ae/UmlsDictionaryLookupAnnotator.java
-- Create the description and use the PipelineBuilder addDescription(..) method.
4.  Use the newer fast dictionary instead of the old one.
-- The basic equivalent of the old *CSV annotator is BsvRareWordDictionary.  It takes a single
parameter "bsvPath".  Instead of comma-separated values it wants Bar-separated values in the
format Cui|Synonym or Cui|Tui|Synonym
-- One misconception that people seem to have is that the "fast" dictionary is faster but
less accurate.  Actually, it is faster and more accurate.  Speed was the greater difference
and that name stuck.
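
For option 4, an existing comma-separated dictionary has to be rewritten as bar-separated values first. The following is a minimal sketch of that conversion, not part of cTAKES: the class CsvToBsv and its methods are hypothetical, and it assumes the old file holds either cui,synonym or cui,tui,synonym with the synonym in the last column. Check the column layout of your own CSV before relying on it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not a cTAKES class): rewrites CSV dictionary lines
// as Cui|Synonym or Cui|Tui|Synonym lines.
final class CsvToBsv {
    private CsvToBsv() {}

    // Converts one line. Returns null when the line has too few columns,
    // an empty field, or a field that already contains the bar separator.
    static String toBsvLine(String csvLine, boolean hasTui) {
        int columns = hasTui ? 3 : 2;
        // Limit the split so commas inside the synonym (last column) survive.
        String[] parts = csvLine.split(",", columns);
        if (parts.length < columns) {
            return null;
        }
        StringBuilder bsv = new StringBuilder();
        for (int i = 0; i < columns; i++) {
            String field = parts[i].trim();
            if (field.isEmpty() || field.contains("|")) {
                return null;
            }
            if (i > 0) {
                bsv.append('|');
            }
            bsv.append(field);
        }
        return bsv.toString();
    }

    // Converts many lines, silently skipping the ones that cannot be converted.
    static List<String> convert(List<String> csvLines, boolean hasTui) {
        List<String> out = new ArrayList<>();
        for (String line : csvLines) {
            String bsv = toBsvLine(line, hasTui);
            if (bsv != null) {
                out.add(bsv);
            }
        }
        return out;
    }
}
```

The converted lines would be written to a file, and the path of that file is what the "bsvPath" parameter mentioned above would point to. Unconvertible lines are dropped rather than reported, which keeps the sketch short; a real conversion would probably want to log them.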

There may be other solutions, but those are what come to mind right now.

Sean
________________________________________
From: Siamak Barzegar <barzegar.siamak@gmail.com>
Sent: Tuesday, June 25, 2019 11:46 AM
To: dev@ctakes.apache.org
Subject: Re: Accessing the External Resource from the UimaContext without Using XML descriptor
[EXTERNAL]

Thanks Sean,

But it seems it is just fine for getting parameters, not external resources,
please see this file:
https://github.com/apache/ctakes/blob/ctakes-4.0.0/ctakes-dictionary-lookup/desc/analysis_engine/DictionaryLookupAnnotatorCSV.xml

It has several externalResourceDependency entries that need to be bound to
externalResource entries. How can I do that with the PipelineBuilder? Do you
have any suggestions?

From Tutorial.ex6 of the UIMA examples:

"When the Analysis Engine is initialized, it creates a single instance of
StringMapResource_impl and loads it with the contents of the data file.
This means that the framework calls the instance's load method, passing it
an instance of DataResource, from which you can obtain a stream or URI/URL
of the external resource that was declared in the external resource..."

How can I do the same for the resource dependencies in
DictionaryLookupAnnotatorCSV.xml?

With Best Wishes,
Siamak


On Tue, 25 Jun 2019 at 16:38, Finan, Sean <Sean.Finan@childrens.harvard.edu>
wrote:

> Hi Siamak,
>
> Good question.  Yet another shortfall in the documentation ...
>
> There are several ways to set parameters in the  PipelineBuilder.
>
> The javadocs for the 4.0.0 release version are here:
> http://ctakes.apache.org/apidocs/4.0.0/
>
> You can use the set(..) method to set "global" values, or place
> component-specific values using the add(..) method.
>
> The PipelineBuilder in trunk has the additional method:
> setIfEmpty(..)        Just like set(..) except any given attributes are
> ignored if they already have values
>
> In addition, the add( component, parameters... ) in trunk has been changed
> to:
> add( component, views, parameters ).
> Views are usually used for training ml models.  To use add(..) like the
> original (without special views) specify add( component,
> Collections.emptyList(), parameters ).   The method usage add( component )
> still exists.  Apparently I was too lazy to properly refactor the method
> with the original signature ...
>
> I hope that helps,
> Sean
>
> ________________________________________
> From: Siamak Barzegar <barzegar.siamak@gmail.com>
> Sent: Tuesday, June 25, 2019 9:23 AM
> To: dev@ctakes.apache.org
> Subject: Accessing the External Resource from the UimaContext without
> Using XML descriptor [EXTERNAL]
>
> I would like to use different cTAKES components with PipelineBuilder
> (exactly as in HelloWorldBuilderRunner.java).
> But the problem is that, as I understand it, PipelineBuilder does not read
> the XML descriptor of a component. I want to use the Dictionary Lookup
> component (DictionaryLookupAnnotatorCSV.xml) in the following pipeline:
>
>          PipelineBuilder builder = new PipelineBuilder();
>          builder
>               .add( SimpleSegmentAnnotator.class )
>               .add( SentenceDetector.class )
>               .add( TokenizerAnnotator.class )
>                // The Java class behind DictionaryLookupAnnotatorCSV.xml:
>               .add(DictionaryLookupAnnotator.class);
>
> But the DictionaryLookupAnnotatorCSV.xml file declares several
> external resources that DictionaryLookupAnnotator needs to read:
>
> public void initialize(UimaContext aContext) {
>   iv_context = aContext;
>   ....
>   FileResource fResrc = (FileResource) iv_context.getResourceObject("LookupDescriptor");
>   ...
>   iv_lookupSpecSet = LookupParseUtilities.parseDescriptor(descFile, iv_context);
> }
>
> So, what is the best way to access the resources declared in
> DictionaryLookupAnnotatorCSV.xml (LookupDescriptorFile,
> DictionaryFileResource, RxnormIndex and OrangeBookIndex) from the code?
>
> Thanks a lot.
> Siamak
>


--
Siamak Barzegar, PhD.
Senior Research Engineer.
Biomedical Text Mining Unit.
Barcelona Supercomputing Centre
