ctakes-dev mailing list archives

From "Finan, Sean" <Sean.Fi...@childrens.harvard.edu>
Subject Re: DefaultFastPipeline.piper and LVG Annotator [EXTERNAL]
Date Tue, 19 Feb 2019 17:12:29 GMT
Hi Jeff,

The short answer: No, LVG is not in the pipeline created by DefaultFastPipeline.piper.
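
For anyone who wants to check this kind of thing mechanically rather than tracing the includes by hand, here is a minimal sketch in Python. It assumes that piper files pull in other piper files with `load`, that annotators are added with `add`, `addLogged`, `addDescription` or `addLast`, and that comment lines start with `//`, `#` or `!`. The sample piper contents and the helper names (`collect_annotators`, `uses_lvg`, `SAMPLE`) are hypothetical illustrations, not the actual contents of DefaultFastPipeline.piper or any part of cTAKES.

```python
# Assumed set of piper commands that add an annotator to the pipeline.
ADD_COMMANDS = {"add", "addLogged", "addDescription", "addLast"}
# Assumed comment markers for piper files.
COMMENT_PREFIXES = ("//", "#", "!")


def collect_annotators(name, read_piper, seen=None):
    """List annotator names added by piper `name`, following `load` lines.

    `read_piper` is any callable that maps a piper name to its text, so the
    caller decides how names are resolved to files.
    """
    if seen is None:
        seen = set()
    if name in seen:  # guard against circular loads
        return []
    seen.add(name)
    annotators = []
    for raw in read_piper(name).splitlines():
        line = raw.strip()
        if not line or line.startswith(COMMENT_PREFIXES):
            continue
        parts = line.split()
        if len(parts) < 2:
            continue
        command, target = parts[0], parts[1]
        if command == "load":
            # Recurse into the included piper file.
            annotators.extend(collect_annotators(target, read_piper, seen))
        elif command in ADD_COMMANDS:
            annotators.append(target)
    return annotators


def uses_lvg(name, read_piper):
    """True if any collected annotator name mentions LVG (case-insensitive)."""
    return any("lvg" in a.lower() for a in collect_annotators(name, read_piper))


# Hypothetical piper contents, for illustration only.
SAMPLE = {
    "TopLevel": "// hypothetical top-level piper\n"
                "load Tokenizer\n"
                "addDescription POSTagger\n"
                "load Lookup\n",
    "Tokenizer": "add SentenceDetector\nadd TokenizerAnnotator\n",
    "Lookup": "add DefaultJCasTermAnnotator\n",
}

if __name__ == "__main__":
    print(collect_annotators("TopLevel", SAMPLE.__getitem__))
    print(uses_lvg("TopLevel", SAMPLE.__getitem__))
```

To run this against real piper files, `read_piper` would need to resolve each `load` name to a file on disk (for example by appending `.piper` and searching the resource directories); that lookup is left to the caller because it depends on the local checkout.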

Longer answer:
In older versions of dictionary lookup, the Lexical Variant Generator (LVG) module was recommended
to capture lexical variants of terms.  However, the dictionary resource already contains variants,
so the LVG module should not make much of a difference.  When the fast lookup was new several
years ago, I ran a test with and without LVG on two datasets, and the difference was along the
lines of +1-2% recall, -1% precision.

I think that ClinicalPipelineFactory.getFastPipeline() was a copy-paste of the previous .getClinicalPipeline()
but with the dictionary module replaced.  So LVG is still in the pipeline that method creates.

When I (more recently) wrote the piper file that you reference, I left out LVG because the added
burden didn't seem to warrant its presence.  By burden I don't just mean the speed decrease
and memory footprint.  There have been a lot of configuration problems with LVG on various
systems, which led to difficulty using cTAKES.

The diagram that you reference places LVG after the dictionary lookup and after the part
of speech tagger, while the page on LVG https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+4.0+-+LVG
lists those as the two modules that may benefit from its presence.  That diagram is very old
and should definitely be updated.  Both the diagram and the page on LVG include information
that predates (and does not account for) the existence of the fast dictionary lookup.

Sean


________________________________________
From: Jeffrey Miller <jeffmax@gmail.com>
Sent: Tuesday, February 19, 2019 10:53 AM
To: dev@ctakes.apache.org
Subject: DefaultFastPipeline.piper and LVG Annotator [EXTERNAL]

Hi,

I was wondering if the LVG Annotator is included in DefaultFastPipeline.piper
<https://urldefense.proofpoint.com/v2/url?u=https-3A__svn.apache.org_repos_asf_ctakes_trunk_ctakes-2Dclinical-2Dpipeline-2Dres_src_main_resources_org_apache_ctakes_clinical_pipeline_DefaultFastPipeline.piper&d=DwIBaQ&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=TrXJiUUghmeYrvrV21K68pCfJk5KnG-xwBfzwVbxoRo&s=3Sgs1Jc-C37kcy1efCEhU_3RV4aFipAt1lbTO0Wu_Ns&e=>.
I have tried to trace through all the includes, but I cannot find it.
However, when I look at the code for
ClinicalPipelineFactory.getFastPipeline(), it seems to be included.
<https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_apache_ctakes_blob_513bb49ebb98c4ac63f690c7b88a82aff18947b8_ctakes-2Dclinical-2Dpipeline_src_main_java_org_apache_ctakes_clinicalpipeline_ClinicalPipelineFactory.java-23L98&d=DwIBaQ&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=TrXJiUUghmeYrvrV21K68pCfJk5KnG-xwBfzwVbxoRo&s=kmZDExXBOyXg84kix__UvgD3LniSHa8MgL8K5fK3XC4&e=>
From the documentation in this flow diagram
<https://urldefense.proofpoint.com/v2/url?u=https-3A__cwiki.apache.org_confluence_download_attachments_68718172_ctakes-2D3.1-2Ddependencies.png-3Fversion-3D1-26modificationDate-3D1488992146000-26api-3Dv2&d=DwIBaQ&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=TrXJiUUghmeYrvrV21K68pCfJk5KnG-xwBfzwVbxoRo&s=4yYVqkyLiodAWATji1EjSwoMh-YpU7qTz2J8tZvRT6I&e=>
from the components documentation page
<https://urldefense.proofpoint.com/v2/url?u=https-3A__cwiki.apache.org_confluence_display_CTAKES_cTAKES-2B4.0-2BComponent-2BUse-2BGuide&d=DwIBaQ&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=TrXJiUUghmeYrvrV21K68pCfJk5KnG-xwBfzwVbxoRo&s=m-9MenhmNTr2vdVAhCvKgBt48OUiQB8R2TkR7fEYtsY&e=>,
it seems to be a recommended component for the dictionary annotator.

Thanks for your help,
Jeff
