ctakes-dev mailing list archives

From Nick Nikandish <snika...@emerginghealthit.com>
Subject RE: Ctakes to process 5000K recoreds
Date Tue, 09 Sep 2014 19:01:58 GMT
I need the medication names for an application I wrote that uses cTAKES, so I cache the
medications in DictionaryLookupAnnotator (in performLookup()) and use them in my program.
With SimpleSegmentAnnotator in the pipeline, however, processing takes forever. After taking
SimpleSegmentAnnotator out, no medication names are returned in DictionaryLookupAnnotator.
Is there a way to eliminate SimpleSegmentAnnotator and still get the medication names in
that class?


-----Original Message-----
From: Pei Chen [mailto:chenpei@apache.org] 
Sent: Tuesday, September 09, 2014 2:54 PM
To: dev@ctakes.apache.org
Subject: Re: Ctakes to process 5000K recoreds

When you say no medication is being annotated, I presume you mean the medication attributes
(i.e. dosage, frequency, etc.) are not being annotated? I think the DrugNER needs a list
of section names in the config; I think it includes SIMPLE_SEGMENT. I am very surprised that
SimpleSegmentAnnotator is the bottleneck, though; all it does is assume the entire document
is a single section called SIMPLE_SEGMENT.
Have you tried commenting out the DependencyParser if you're not using those features?
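
To make the "single section" behaviour described above concrete, here is a minimal sketch in plain Java. It is an illustration only: `SingleSegmentSketch`, `Segment`, and `wholeDocumentSegment` are hypothetical names made up for this example and are not the cTAKES or UIMA API. The only detail taken from the thread is the section name SIMPLE_SEGMENT.

```java
import java.util.Objects;

// Hypothetical illustration only: these types are NOT the cTAKES/UIMA API.
final class SingleSegmentSketch {

    /** A document section: an id plus a character span [begin, end). */
    static final class Segment {
        final String id;
        final int begin; // inclusive character offset
        final int end;   // exclusive character offset

        Segment(String id, int begin, int end) {
            this.id = id;
            this.begin = begin;
            this.end = end;
        }
    }

    // Section name mentioned in the thread.
    static final String DEFAULT_SEGMENT_ID = "SIMPLE_SEGMENT";

    /** Treats the whole document as one section, with no parsing of the text. */
    static Segment wholeDocumentSegment(String documentText) {
        Objects.requireNonNull(documentText, "documentText");
        return new Segment(DEFAULT_SEGMENT_ID, 0, documentText.length());
    }

    public static void main(String[] args) {
        String note = "Patient takes aspirin 81 mg daily.";
        Segment s = wholeDocumentSegment(note);
        System.out.println(s.id + " [" + s.begin + ", " + s.end + ")");
    }
}
```

If the real annotator behaves as described, this step does a fixed amount of work per document, so it is unlikely to be slow in itself. A plausible reading is that later components which look for sections only do their work once a segment exists, which would also explain why removing the step makes the run fast but leaves nothing annotated.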


On Tue, Sep 9, 2014 at 2:45 PM, Nick Nikandish <snikandi@emerginghealthit.com> wrote:
> Hi there,
> I am using cTAKES to process 5000K free-text records where each record has several
> This is the fixed flow that it goes through:
>     <node>SimpleSegmentAnnotator</node>
>     <node>SentenceDetectorAnnotator</node>
>     <node>TokenizerAnnotator</node>
>     <node>LvgAnnotator</node>
>     <node>ContextDependentTokenizerAnnotator</node>
>     <node>POSTagger</node>
>     <node>Chunker</node>
>     <node>LookupWindowAnnotator</node>
>     <node>DictionaryLookupAnnotatorDB</node>
>     <node>DependencyParser</node>
>     <node>AssertionAnnotator</node>
>     <node>ExtractionPrepAnnotator</node>
> But it takes a very long time to process that much data (maybe a week or so) when
> I use SimpleSegmentAnnotator. By eliminating SimpleSegmentAnnotator the process is very fast,
> but no medication is being annotated. Do you guys have any suggestions?
> Thanks,
> Nick