ctakes-dev mailing list archives

From "Miller, Timothy" <Timothy.Mil...@childrens.harvard.edu>
Subject Re: How to use external CSV or BSV in addition to FastUMLS attention Sean [EXTERNAL]
Date Thu, 04 Jan 2018 13:42:14 GMT
The UIMA Analysis Engine descriptor for the dictionary component has a parameter for what cTAKES
calls a "lookup descriptor". By default, the lookup descriptor describes a lookup in an HSQL
engine. The XML files in that sample directory are lookup descriptors for lookups that use the
BSV files they point to. If you want your BSV lookup to complement the default lookup, you can
simply run two instances of the dictionary component, each with a different lookup descriptor.
I think a single lookup descriptor can also hold multiple lookup types (i.e. multiple <dictionary>
sections inside <dictionaries>), but I can't guarantee that works!
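
As a rough illustration of the "multiple <dictionary> sections inside <dictionaries>" idea, here
is a minimal Python sketch that lists the dictionaries declared in a lookup descriptor. Only the
<dictionaries> and <dictionary> element names come from the message above. The root element, the
<name> child, and both dictionary names are assumptions made for the example, so check them
against the sample XML files before relying on them.

```python
import xml.etree.ElementTree as ET

# Illustrative descriptor only. <dictionaries> and <dictionary> are taken from
# the message above; the root element, the <name> child, and the two
# dictionary names are assumptions, not the confirmed cTAKES schema.
SAMPLE_DESCRIPTOR = """
<lookupSpecification>
  <dictionaries>
    <dictionary>
      <name>DefaultUmlsTerms</name>
    </dictionary>
    <dictionary>
      <name>CustomBsvTerms</name>
    </dictionary>
  </dictionaries>
</lookupSpecification>
"""


def list_dictionary_names(xml_text):
    """Return the <name> text of every <dictionary> under <dictionaries>."""
    root = ET.fromstring(xml_text)
    names = []
    for dictionary in root.findall("./dictionaries/dictionary"):
        # Fall back to a marker string if a dictionary has no <name> child.
        name = dictionary.findtext("name", default="(unnamed)")
        names.append(name.strip())
    return names


if __name__ == "__main__":
    for name in list_dictionary_names(SAMPLE_DESCRIPTOR):
        print(name)
```

Running a check like this against your own descriptor is a quick way to confirm that both the
default and the BSV dictionary are actually declared before starting the pipeline. It does not
tell you whether cTAKES honors more than one of them at run time.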

From: Abramowitsch, Peter <pabramowitsch@hearst.com>
Sent: Thursday, January 4, 2018 7:51 AM
To: dev@ctakes.apache.org
Subject: Re: How to use external CSV or BSV in addition to FastUMLS  attention Sean [EXTERNAL]

Thanks Tim,

I did see that folder and its contents and it seemed the right place to
begin.  What I couldn't find was how/where to refer to one of those
CustomCuiTui.Xml files in an engine description.


On 1/4/18, 1:41 PM, "Miller, Timothy"
<Timothy.Miller@childrens.harvard.edu> wrote:

>Peter, I know Sean is busy this week and he may not see this for a while.
>But I tried this method over the summer and got it to work so I'm fairly
>confident that's the right approach still. Some of the details may have
>changed from two years ago, so I would also check out this directory as a
>starting point:
>From: Abramowitsch, Peter <pabramowitsch@hearst.com>
>Sent: Thursday, January 4, 2018 7:28 AM
>To: dev@ctakes.apache.org
>Subject: Re: How to use external CSV or BSV in addition to FastUMLS
>attention Sean [EXTERNAL]
>Further to my previous message, Sean, I was wondering if you could tell
>me whether this answer you gave in 2015, is still the right way to do
>things in ctakes4.x
>Subject:        RE: How to update cTAKES so that new top level categories
>come out based on local
>From:   Finan, Sean (Sean...@childrens.harvard.edu)
>Date:   Oct 6, 2015 2:04:56 pm
>List:   org.apache.incubator.ctakes-dev
>From: Abramowitsch, Peter
>Date: Thursday, January 4, 2018 at 12:50 PM
>To: dev@ctakes.apache.org
>Subject: How to use external CSV or BSV in addition to FastUMLS
>Can someone point me to any up-to-date how-tos on how to include external
>CSV/BSV type resources to add synonyms and other terms for dictionary
>lookup, to augment the FAST UMLS resources that come out of the box?
>Perhaps I have missed something, but looking at the
>CTakesDictionaryCreator UI, it looks like it is designed only to choose
>subsets of the UMLS data set rather than allowing one to bring in
>completely new information sources.  I scoured the Marklogic ctakes user
>archive, but so many of the entries are old and I'm not sure they
>describe the current way of doing things.
>The only approach I could see would be to use the AggregateEngine
>description and have it point to the CSV annotator, creating a completely
>new AE but this would build other types of annotation, whereas what I'm
>thinking about is a case for creating identified mentions such as a
>DiseaseDisorderMention based on finding an acronym that the UMLS resource
>doesn't know about, even though the concept in its full textual form is
>I'm sure this is not a unique request and apologize in advance if it has
>already been answered somewhere
>- Peter
