ctakes-dev mailing list archives

From Pei Chen <chen...@apache.org>
Subject Re: Ctakes to process 5000K recoreds
Date Tue, 09 Sep 2014 18:54:19 GMT
When you say no medication is being annotated, I presume you mean the
medication attributes (i.e. dosage, frequency, etc.) are not being
annotated?  I think the DrugNER needs a list of section names in its
config; I think that list includes SIMPLE_SEGMENT.  I am very surprised
that SimpleSegmentAnnotator is the bottleneck though; all it does is
treat the entire document as a single section called SIMPLE_SEGMENT.
Have you tried commenting out the DependencyParser if you're not using
those features?
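
A minimal sketch of the DependencyParser suggestion, reusing only the node names from the quoted flow. The enclosing <fixedFlow> element is assumed from the usual UIMA aggregate descriptor layout, and whether downstream annotators (for example AssertionAnnotator) still behave correctly without the parser's output is not confirmed here, so it is worth trying on a small sample first:

```xml
<fixedFlow>
  <node>SimpleSegmentAnnotator</node>
  <node>SentenceDetectorAnnotator</node>
  <node>TokenizerAnnotator</node>
  <node>LvgAnnotator</node>
  <node>ContextDependentTokenizerAnnotator</node>
  <node>POSTagger</node>
  <node>Chunker</node>
  <node>LookupWindowAnnotator</node>
  <node>DictionaryLookupAnnotatorDB</node>
  <!-- Disabled to check whether dependency parsing dominates the runtime -->
  <!-- <node>DependencyParser</node> -->
  <node>AssertionAnnotator</node>
  <node>ExtractionPrepAnnotator</node>
</fixedFlow>
```

Timing the same small batch with and without the commented-out node should show whether the parser, rather than SimpleSegmentAnnotator, accounts for the slowdown.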


On Tue, Sep 9, 2014 at 2:45 PM, Nick Nikandish
<snikandi@emerginghealthit.com> wrote:
> Hi there,
> I am using cTAKES to process 5000K free-text records where each record has several
> This is the fixed flow that it goes through:
>     <node>SimpleSegmentAnnotator</node>
>     <node>SentenceDetectorAnnotator</node>
>     <node>TokenizerAnnotator</node>
>     <node>LvgAnnotator</node>
>     <node>ContextDependentTokenizerAnnotator</node>
>     <node>POSTagger</node>
>     <node>Chunker</node>
>     <node>LookupWindowAnnotator</node>
>     <node>DictionaryLookupAnnotatorDB</node>
>     <node>DependencyParser</node>
>     <node>AssertionAnnotator</node>
>     <node>ExtractionPrepAnnotator</node>
> But it takes a very long time to process that much data (maybe a week or so) when
> I use SimpleSegmentAnnotator.  By eliminating SimpleSegmentAnnotator the process is very fast,
> but no medication is being annotated.  Do you guys have any suggestions?
> Thanks,
> Nick
